He Thinks He Knows Better than the Doctors: BERT for Event Factuality Fails on Pragmatics
Nanjiang Jiang, Marie-Catherine de Marneffe

TL;DR
This paper evaluates how well BERT predicts event factuality on several English datasets. BERT scores well on most of them, but it does so by exploiting surface patterns that correlate with factuality labels, and it fails where pragmatic reasoning is required.
Contribution
The study highlights BERT's reliance on surface patterns and its limitations in pragmatic reasoning for factuality prediction.
Findings
BERT performs well on most datasets but relies on surface cues.
BERT fails on pragmatic reasoning cases.
Despite the high scores, current systems are still far from robust factuality prediction.
Abstract
We investigate how well BERT performs on predicting factuality in several existing English datasets, encompassing various linguistic constructions. Although BERT obtains a strong performance on most datasets, it does so by exploiting common surface patterns that correlate with certain factuality labels, and it fails on instances where pragmatic reasoning is necessary. Contrary to what the high performance suggests, we are still far from having a robust system for factuality prediction.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Layer Normalization · Weight Decay · Dropout · Linear Warmup With Linear Decay · Residual Connection
