Understanding by Understanding Not: Modeling Negation in Language Models
Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R Devon Hjelm, Alessandro Sordoni, Aaron Courville

TL;DR
This paper introduces a method to improve language models' handling of negation by augmenting training with an unlikelihood objective based on negated sentences, significantly reducing errors on negation tasks.
Contribution
It proposes a novel training objective combining likelihood and unlikelihood to enhance negation understanding in pre-trained language models.
Findings
Reduced the mean top-1 error rate to 4% on the negated LAMA dataset
Showed some improvements on negated NLI benchmarks
Demonstrated effectiveness of unlikelihood training for negation
Abstract
Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language models often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective, we reduce the mean top-1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.
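The combined objective described in the abstract can be illustrated with a minimal sketch. It assumes the common form of an unlikelihood term, -log(1 - p), applied to a token that should not be predicted in a negated sentence, alongside the usual likelihood term, -log p, on the original sentence. The mixing weight `alpha`, the toy vocabulary, the logits, and all function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def likelihood_loss(probs, target):
    """Standard language-modeling loss: -log p(target)."""
    return -math.log(probs[target])

def unlikelihood_loss(probs, target, eps=1e-12):
    """Unlikelihood loss: -log(1 - p(target)); penalizes a high p(target)."""
    return -math.log(max(1.0 - probs[target], eps))

def combined_loss(pos_probs, neg_probs, target, alpha=0.5):
    """Weighted sum of the two terms.

    pos_probs: model distribution for the original sentence
    neg_probs: model distribution for the negated sentence
    alpha:     hypothetical mixing weight (an assumption, not from the paper)
    """
    return (alpha * likelihood_loss(pos_probs, target)
            + (1.0 - alpha) * unlikelihood_loss(neg_probs, target))

# Toy example (made-up logits): vocabulary ["fly", "swim", "bark"].
# "birds can [MASK]" should favor "fly"; "birds cannot [MASK]" should not.
vocab = ["fly", "swim", "bark"]
pos = softmax([3.0, 0.5, 0.1])   # masked position in the original sentence
neg = softmax([2.5, 0.7, 0.2])   # masked position in the negated sentence
fly = vocab.index("fly")
print("likelihood term:  ", likelihood_loss(pos, fly))
print("unlikelihood term:", unlikelihood_loss(neg, fly))
print("combined loss:    ", combined_loss(pos, neg, fly))
```

In this toy setup the unlikelihood term is large because the model still assigns high probability to "fly" after negation; minimizing the combined loss pushes that probability down while keeping the prediction for the original sentence intact.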
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Tanh Activation · Softmax · Layer Normalization · Linear Warmup With Linear Decay · WordPiece · Dense Connections
