Does Entity Abstraction Help Generative Transformers Reason?
Nicolas Gontier, Siva Reddy, Christopher Pal

TL;DR
This paper investigates whether incorporating entity type abstractions into pre-trained Transformers improves their logical reasoning abilities across various NLP tasks, finding significant gains in formal reasoning but limited benefits in less structured tasks.
Contribution
The study introduces three methods to incorporate entity abstractions into Transformers and empirically evaluates their impact on multiple reasoning tasks, highlighting where they are most effective.
Findings
Entity abstraction improves performance on formal reasoning tasks.
Best models achieved 88.8% and 91.8% accuracy on CLUTRR and ProofWriter, respectively, against baselines of 62.9% and 89.8%.
Limited improvement (0.5% F1) observed on less formal NLP tasks.
Abstract
We study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on four NLP tasks requiring different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question answering (HotpotQA), and (4) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge perform better than models without it. The best abstraction-aware models achieved an overall accuracy of 88.8% and 91.8% compared to the baseline model achieving 62.9% and 89.8% on CLUTRR and ProofWriter respectively.…
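The three ways of adding abstraction listed in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the type vocabulary, the random embedding tables, the function names, and the loss weighting are all hypothetical, chosen only to show what each variant feeds the model.

```python
import random

# Hypothetical entity-type vocabulary; "O" marks tokens that are not entities.
TYPE_VOCAB = {"O": 0, "PERSON": 1, "LOCATION": 2}


def make_table(num_rows, dim, seed):
    """Build a small random embedding table (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def sum_embeddings(token_ids, type_ids, token_table, type_table):
    """(i) Add each token's type embedding to its token embedding."""
    return [
        [t + e for t, e in zip(token_table[tok], type_table[typ])]
        for tok, typ in zip(token_ids, type_ids)
    ]


def separate_sequences(tokens, types):
    """(ii) Keep the text and its type tags as two aligned sequences,
    each of which would go to its own encoder."""
    return list(tokens), list(types)


def auxiliary_targets(types):
    """(iii) Turn type tags into per-token labels for an extra prediction head."""
    return [TYPE_VOCAB[t] for t in types]


def combined_loss(main_loss, type_loss, weight=1.0):
    """(iii) Weighted sum of the main loss and the type-prediction loss.
    The weighting scheme is an assumption, not taken from the paper."""
    return main_loss + weight * type_loss


# Toy input: four tokens with hand-assigned entity types.
tokens = ["Alice", "lives", "in", "Paris"]
types = ["PERSON", "O", "O", "LOCATION"]
token_ids = list(range(len(tokens)))
type_ids = auxiliary_targets(types)

token_table = make_table(len(tokens), 3, seed=0)
type_table = make_table(len(TYPE_VOCAB), 3, seed=1)

summed = sum_embeddings(token_ids, type_ids, token_table, type_table)
text_seq, type_seq = separate_sequences(tokens, types)
print(len(summed), len(summed[0]))
print(type_seq)
print(type_ids)
```

In this sketch, variant (i) changes only the input representation, variant (ii) doubles the number of sequences to encode, and variant (iii) leaves the input untouched and adds a training signal instead.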
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Attentive Walk-Aggregating Graph Neural Network
