Mary, the Cheeseburger-Eating Vegetarian: Do LLMs Recognize Incoherence in Narratives?
Karin de Langis, Püren Öncel, Ryan Peters, Andrew Elfenbein, Laura Kristen Allen, Andreas Schramm, Dongyeop Kang

TL;DR
This paper investigates how well large language models can distinguish coherent from incoherent narratives, revealing they recognize incoherence internally but struggle to reflect this understanding in responses, highlighting gaps in narrative comprehension.
Contribution
The study demonstrates that LLMs' internal representations can identify incoherence, but their responses often fail to differentiate coherence levels, exposing limitations in their narrative understanding.
Findings
LLMs' internal states reliably detect incoherence.
Responses to rating questions often do not distinguish coherence.
LLMs are more sensitive to setting violations than character trait violations.
Abstract
Leveraging a dataset of paired narratives, we investigate the extent to which large language models (LLMs) can reliably separate incoherent and coherent stories. A probing study finds that LLMs' internal representations can reliably identify incoherent narratives. However, LLMs generate responses to rating questions that fail to satisfactorily separate the coherent and incoherent narratives across several prompt variations, hinting at a gap in LLMs' understanding of storytelling. The reasoning LLMs tested do not eliminate these deficits, indicating that thought strings may not be able to fully address the discrepancy between model internal state and behavior. Additionally, we find that LLMs appear to be more sensitive to incoherence resulting from an event that violates the setting (e.g., a rainy day in the desert) than to incoherence arising from a character violating an established…
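The abstract does not describe how the probing study was set up, so the following is only a generic sketch of what a linear probe looks like, not the authors' method. It fits a logistic-regression classifier to vectors standing in for model hidden states and measures how well "coherent" and "incoherent" labels can be separated. Everything here is an assumption made for illustration: the function names, the 16-dimensional vectors, and above all the synthetic data, which is randomly generated and says nothing about any real model.

```python
import numpy as np


def train_linear_probe(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(incoherent)
        grad = p - y                             # gradient of log loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples the probe labels correctly."""
    preds = (X @ w + b) > 0
    return float((preds == y.astype(bool)).mean())


def make_synthetic_states(n, dim, rng):
    """Hypothetical stand-in for hidden states: two shifted Gaussian clusters.

    Label 1 plays the role of "incoherent", label 0 of "coherent".
    Real probing would use activations extracted from an LLM instead.
    """
    y = rng.integers(0, 2, size=n)
    shift = np.where(y[:, None] == 1, 0.5, -0.5)
    X = rng.normal(size=(n, dim)) + shift
    return X, y.astype(float)


rng = np.random.default_rng(0)
X_train, y_train = make_synthetic_states(400, 16, rng)
X_test, y_test = make_synthetic_states(200, 16, rng)

w, b = train_linear_probe(X_train, y_train)
acc = probe_accuracy(w, b, X_test, y_test)
print(f"held-out probe accuracy on synthetic data: {acc:.2f}")
```

High held-out accuracy for such a probe would indicate that the label is linearly recoverable from the representations, which is the sense in which internal states can "identify" incoherence even when generated ratings do not.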
Taxonomy
Topics: Action Observation and Synchronization · Narrative Theory and Analysis · Psychology of Moral and Emotional Judgment
