Finding Sense in Nonsense with Generated Contexts: Perspectives from Humans and Language Models
Katrina Olsen, Sebastian Padó

TL;DR
This paper investigates how humans and language models distinguish merely anomalous sentences from truly nonsensical ones, finding that raters judge most sentences to be at most anomalous and that LLMs can generate plausible contexts for the anomalous cases.
Contribution
It provides a systematic comparison of human and LLM sensicality judgments on deviant sentences, highlighting LLMs' ability to generate plausible contexts for anomalous cases.
Findings
Human raters judge most sentences in the five semantically deviant datasets to be at most anomalous; only a few are rated as properly nonsensical.
LLMs can generate plausible contexts for anomalous sentences.
Abstract
Nonsensical and anomalous sentences have been instrumental in the development of computational models of semantic interpretation. A core challenge is to distinguish between what is merely anomalous (but can be interpreted given a supporting context) and what is truly nonsensical. However, it is unclear (a) how nonsensical, rather than merely anomalous, existing datasets are; and (b) how well LLMs can make this distinction. In this paper, we answer both questions by collecting sensicality judgments from human raters and LLMs on sentences from five semantically deviant datasets: both context-free and when providing a context. We find that raters consider most sentences at most anomalous, and only a few as properly nonsensical. We also show that LLMs are substantially skilled in generating plausible contexts for anomalous cases.
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
