When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
Sebastian Schuster, Tal Linzen

TL;DR
This paper evaluates how well Transformer-based models such as GPT-2 and GPT-3 track discourse entities, especially when a sentential operator such as negation determines whether an indefinite NP introduces an entity, and it reveals limits in their systematic entity tracking.
Contribution
The paper introduces an English evaluation suite for assessing discourse entity tracking in language models and uses it for a fine-grained analysis of where GPT-2 and GPT-3 succeed and where they fall short.
Findings
Models show partial sensitivity to the interactions between sentential operators and indefinite NPs.
Performance declines with multiple NPs, indicating difficulty in complex scenarios.
GPT-3 does not fully acquire basic entity tracking abilities.
Abstract
Understanding longer narratives or participating in conversations requires tracking of discourse entities that have been mentioned. Indefinite noun phrases (NPs), such as 'a dog', frequently introduce discourse entities but this behavior is modulated by sentential operators such as negation. For example, 'a dog' in 'Arthur doesn't own a dog' does not introduce a discourse entity due to the presence of negation. In this work, we adapt the psycholinguistic assessment of language models paradigm to higher-level linguistic phenomena and introduce an English evaluation suite that targets the knowledge of the interactions between sentential operators and indefinite NPs. We use this evaluation suite for a fine-grained investigation of the entity tracking abilities of the Transformer-based models GPT-2 and GPT-3. We find that while the models are to a certain extent sensitive to the…
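The comparison described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes that an item is scored by checking whether a continuation that refers back to the entity is more probable after a context that introduces the entity than after one that does not. The names `Scorer`, `prefers_introducing_context`, `accuracy`, and `toy_scorer` are hypothetical, as are the affirmative context and the continuation sentence. In practice the scorer would return a language model's log-probability of the continuation given the context.

```python
from typing import Callable, Dict, List

# Hypothetical scorer type: returns log P(continuation | context) under some model.
Scorer = Callable[[str, str], float]


def prefers_introducing_context(
    score: Scorer, intro: str, non_intro: str, continuation: str
) -> bool:
    """True if the referring continuation is more probable after the context
    that introduces the entity than after the one that does not."""
    return score(intro, continuation) > score(non_intro, continuation)


def accuracy(score: Scorer, items: List[Dict[str, str]]) -> float:
    """Fraction of items on which the scorer shows the expected preference."""
    if not items:
        raise ValueError("items must not be empty")
    hits = sum(
        prefers_introducing_context(
            score, item["intro"], item["non_intro"], item["continuation"]
        )
        for item in items
    )
    return hits / len(items)


def toy_scorer(context: str, continuation: str) -> float:
    """Stand-in for a real model (assumption): gives a lower log-probability
    to any continuation that follows a negated context."""
    return -5.0 if "n't" in context else -1.0


# Illustrative item; only the negated sentence comes from the abstract.
ITEMS = [
    {
        "intro": "Arthur owns a dog.",
        "non_intro": "Arthur doesn't own a dog.",
        "continuation": "The dog is very friendly.",
    }
]

if __name__ == "__main__":
    print(accuracy(toy_scorer, ITEMS))
```

Passing the scorer in as a function keeps the comparison logic independent of any particular model or library, so the same item format could be reused across models.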
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Discriminative Fine-Tuning · Byte Pair Encoding · Residual Connection · Linear Warmup With Cosine Annealing · Attention Dropout
