Understanding by Understanding Not: Modeling Negation in Language Models

Arian Hosseini; Siva Reddy; Dzmitry Bahdanau; R Devon Hjelm; Alessandro Sordoni; Aaron Courville

arXiv:2105.03519 · cs.CL · May 11, 2021 · 1 citation

TL;DR

This paper improves language models' handling of negation by augmenting the language modeling objective with an unlikelihood objective built from negated generic sentences. Training BERT this way reduces the mean top-1 error rate on the negated LAMA dataset to 4%.

Contribution

It proposes a novel training objective combining likelihood and unlikelihood to enhance negation understanding in pre-trained language models.

Findings

01

Reduced BERT's mean top-1 error rate to 4% on the negated LAMA dataset

02

Improved performance on negated NLI benchmarks

03

Demonstrated effectiveness of unlikelihood training for negation

Abstract

Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language models often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top-1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.
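
As a rough illustration of what a combined likelihood and unlikelihood objective can look like, here is a minimal pure-Python sketch. It is not the authors' implementation. The unlikelihood term is assumed to take the common form -log(1 - p(token)), and the function names, the weighting factor `alpha`, and the toy logits are all hypothetical choices made for this example.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def likelihood_loss(logits, target):
    """Standard LM term: -log p(target). Pushes p(target) up."""
    return -math.log(softmax(logits)[target])

def unlikelihood_loss(logits, target, eps=1e-12):
    """Assumed unlikelihood term: -log(1 - p(target)). Pushes p(target) down."""
    p = softmax(logits)[target]
    return -math.log(max(1.0 - p, eps))

def combined_loss(plain_examples, negated_examples, alpha=1.0):
    """Average likelihood loss on ordinary examples plus alpha times the
    average unlikelihood loss on negated examples.

    Each example is a (logits, token_index) pair. For a negated sentence,
    token_index marks the completion that should become unlikely.
    alpha is a hypothetical weighting, not a value taken from the paper.
    """
    lik = sum(likelihood_loss(l, t) for l, t in plain_examples) / len(plain_examples)
    unl = sum(unlikelihood_loss(l, t) for l, t in negated_examples) / len(negated_examples)
    return lik + alpha * unl

# Toy vocabulary of 3 tokens; index 0 stands for "fly".
plain = [([2.0, 0.5, 0.1], 0)]    # "birds can [MASK]"    -> "fly" should be likely
negated = [([2.0, 0.5, 0.1], 0)]  # "birds cannot [MASK]" -> "fly" should be unlikely
loss = combined_loss(plain, negated, alpha=1.0)
```

In this toy setup the same logits are scored twice: the likelihood term rewards a high probability for token 0 in the plain context, while the unlikelihood term penalizes that same high probability in the negated context, which is the tension the combined objective is meant to resolve.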

Code & Models

Repositories

arianhosseini/negation-learning
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Tanh Activation · Softmax · Layer Normalization · Linear Warmup With Linear Decay · WordPiece · Dense Connections