The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Samuel R. Bowman

TL;DR
This paper highlights the risks of underclaiming, that is, overstating how NLP systems fail and understating what they can do. It argues that accurate claims about system limitations are needed to maintain the field's credibility and to address both present-day harms and future challenges.
Contribution
It urges researchers to be careful when making claims about the limits of NLP systems, and it suggests research directions and communication strategies that make misleading claims easier to avoid or rebut.
Findings
Underclaiming, often a well-meaning response to hype, has produced many misleading or false claims about the limits of the best current NLP technology.
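Lost credibility makes it harder to mitigate present-day harms, such as biased systems for content moderation or resume screening, and limits the field's ability to prepare for the impacts of future advances.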
Research directions and communication strategies are suggested to help researchers avoid or rebut such claims.
Abstract
Researchers in NLP often frame and discuss research results in ways that serve to deemphasize the field's successes, often in response to the field's widespread hype. Though well-meaning, this has yielded many misleading or false claims about the limits of our best technology. This is a problem, and it may be more serious than it looks: It harms our credibility in ways that can make it harder to mitigate present-day harms, like those involving biased systems for content moderation or resume screening. It also limits our ability to prepare for the potentially enormous impacts of more distant future advances. This paper urges researchers to be careful about these claims and suggests some research directions and communication strategies that will make it easier to avoid or rebut them.
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Attention Is All You Need · Softmax · RAdam · Graph Self-Attention · Hyperboloid Embeddings
