Causal Reasoning of Entities and Events in Procedural Texts
Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, and Chris Callison-Burch

TL;DR
This paper introduces CREPE, a benchmark for causal reasoning in procedural texts, revealing current language models' limitations and demonstrating improved performance through code-like and chain-of-thought prompting techniques.
Contribution
The paper presents CREPE, the first benchmark on causal reasoning of event plausibility and entity states, and proposes a code-like event representation with injected causal relations that raises model performance from .35 to .67 F1.
Findings
Most language models, including GPT-3, perform close to chance on CREPE at .35 F1, far behind human performance at .87 F1.
Representing events in a code-like form and prompting language models pretrained on code improves performance to .59 F1.
Injecting the causal relations between entities and events as intermediate reasoning steps, in the manner of chain-of-thought, boosts performance further to .67 F1.
Abstract
Entities and events are crucial to natural language reasoning and common in procedural texts. Existing work has focused either exclusively on entity state tracking (e.g., whether a pan is hot) or on event reasoning (e.g., whether one would burn themselves by touching the pan), while these two tasks are often causally related. We propose CREPE, the first benchmark on causal reasoning of event plausibility and entity states. We show that most language models, including GPT-3, perform close to chance at .35 F1, lagging far behind human at .87 F1. We boost model performance to .59 F1 by creatively representing events as programming languages while prompting language models pretrained on code. By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1. Our findings indicate not only the…
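To make the idea of a code-like event representation more concrete, here is a minimal sketch of how a procedure might be rendered as a code-style prompt, with entity states injected as intermediate comments. It is an illustration only and not the prompt format used in the paper: the names `Step`, `to_identifier`, and `build_prompt`, the comment layout, and the wording of the final query line are all assumptions made for this example. The pan example follows the one in the abstract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One step of a procedure (hypothetical structure, not from the paper)."""
    text: str
    # Assumed annotation: entity states that hold after this step.
    entity_states: List[str] = field(default_factory=list)


def to_identifier(text: str) -> str:
    """Turn free text into a snake_case name, e.g. 'Turn on the stove'."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return "_".join(cleaned.split())


def build_prompt(goal: str, steps: List[Step], event: str) -> str:
    """Render a procedure as a code-like prompt string.

    Each step becomes a function call; entity states are injected as
    comments that act as intermediate reasoning steps. The last line is
    left open for a code-pretrained language model to complete.
    """
    lines = [f"# Goal: {goal}", "def procedure():"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"    {to_identifier(step.text)}()  # step {i}")
        for state in step.entity_states:
            lines.append(f"    # entity state: {state}")
    # Assumed query format: the model fills in how likely the event is now.
    lines.append(f'    # event "{event}" becomes:')
    return "\n".join(lines)


if __name__ == "__main__":
    steps = [
        Step("Put the pan on the stove"),
        Step("Turn on the stove", ["the pan is hot"]),
    ]
    print(build_prompt("Fry an egg", steps, "touching the pan burns you"))
```

The entity-state comments are where the causal link between an entity (the pan being hot) and an event (being burned by touching it) enters the prompt; removing them leaves a plain code-like representation without the intermediate reasoning.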
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Cosine Annealing · Dense Connections · Linear Warmup With Cosine Annealing
