Do large language models and humans have similar behaviors in causal inference with script knowledge?
Xudong Hong, Margarita Ryzhova, Daniel Adrian Biondi, Vera Demberg

TL;DR
This study compares human and large language model behaviors in causal inference within script-based stories, revealing that recent models partially mimic human responses but still struggle with integrating script knowledge.
Contribution
It provides a systematic comparison of human and LLM causal reasoning in script contexts, highlighting current models' limitations and partial alignment with human behavior.
Findings
Humans show longer reading times for causal conflicts.
Recent LLMs like GPT-3 correlate with human responses in some conditions.
All models fail to predict the lower surprise of an event whose cause is not mentioned.
Abstract
Recently, large pre-trained language models (LLMs) have demonstrated superior language understanding abilities, including zero-shot causal reasoning. However, it is unclear to what extent their capabilities are similar to human ones. We here study the processing of an event B in a script-based story, which causally depends on a previous event A. In our manipulation, event A is stated, negated, or omitted in an earlier section of the text. We first conducted a self-paced reading experiment, which showed that humans exhibit significantly longer reading times when causal conflicts exist (¬A → B) than under logical conditions (A → B). However, reading times remain similar when cause A is not explicitly mentioned, indicating that humans can easily infer event B from their script knowledge. We then tested a variety of LLMs on the same data to check to what…
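The comparison between conditions can be illustrated with a minimal sketch. It assumes a model assigns a probability to the target event in each of three conditions (cause stated, cause omitted, cause negated) and converts those probabilities to surprisal. The probabilities, the function names `surprisal` and `matches_human_pattern`, and the ordering check are hypothetical illustrations, not the authors' evaluation code or data.

```python
import math


def surprisal(prob: float) -> float:
    """Return the surprisal, in bits, of an event with probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def matches_human_pattern(s: dict) -> bool:
    """Check the qualitative pattern described for human readers:
    the negated-cause condition should be more surprising than both
    the stated-cause and the omitted-cause conditions.
    (Hypothetical criterion chosen for this illustration.)"""
    return s["negated"] > s["stated"] and s["negated"] > s["omitted"]


# Made-up probabilities of the target event under each condition.
human_like = {"stated": 0.5, "omitted": 0.25, "negated": 0.125}
# A made-up model that treats an omitted cause as worse than a negated one.
not_human_like = {"stated": 0.5, "omitted": 0.125, "negated": 0.25}

for name, probs in [("human_like", human_like), ("not_human_like", not_human_like)]:
    s = {cond: surprisal(p) for cond, p in probs.items()}
    print(name, s, matches_human_pattern(s))
```

In this sketch the second set of values fails the check because the omitted-cause condition comes out more surprising than the negated-cause one, which is the kind of mismatch the findings describe.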
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training · Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Byte Pair Encoding · Dropout
