Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?

Anthony GX-Chen; Dongyan Lin; Mandana Samiei; Doina Precup; Blake A. Richards; Rob Fergus; Kenneth Marino

arXiv:2505.09614·cs.AI·October 7, 2025

TL;DR

This paper tests whether language model agents can explore an environment and infer its causal relationships. It finds biases similar to those of humans and proposes a test-time sampling method to make the agents' reasoning more scientific.

Contribution

The study demonstrates that LMs exhibit a disjunctive reasoning bias and introduces a test-time sampling method to mitigate this bias, enhancing causal inference.

Findings

01 LMs reliably infer disjunctive causal relationships

02 LMs struggle with conjunctive causal relationships

03 The proposed sampling method reduces the disjunctive bias

Abstract

Language model (LM) agents are increasingly used as autonomous decision-makers which need to actively gather information to guide their decisions. A crucial cognitive skill for such agents is the efficient exploration and understanding of the causal structure of the world -- key to robust, scientifically grounded reasoning. Yet, it remains unclear whether LMs possess this capability or exhibit systematic biases leading to erroneous conclusions. In this work, we examine LMs' ability to explore and infer causal relationships, using the well-established Blicket Test paradigm from developmental psychology. We find that LMs reliably infer the common, intuitive disjunctive causal relationships but systematically struggle with the unusual, yet equally (or sometimes even more) evidenced conjunctive ones. This "disjunctive bias" persists across model families, sizes, and prompting strategies,…
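
To make the disjunctive/conjunctive distinction in the abstract concrete, the following is a minimal Python sketch of a Blicket-style detector. It assumes the usual reading from the developmental psychology literature: under a disjunctive rule the machine activates when at least one blicket is placed on it, and under a conjunctive rule it activates only when all blickets (here, two) are placed together. The object names, the choice of blickets, and the helper functions are hypothetical illustrations, not the paper's environment or method.

```python
from itertools import combinations

# Hypothetical setup: three objects, two of which are blickets.
OBJECTS = {"A", "B", "C"}
BLICKETS = {"A", "B"}


def disjunctive(placed, blickets=BLICKETS):
    """Machine turns on if at least one blicket is on it (assumed rule)."""
    return len(set(placed) & blickets) >= 1


def conjunctive(placed, blickets=BLICKETS):
    """Machine turns on only if every blicket is on it (assumed rule)."""
    return blickets <= set(placed)


def observe(rule, objects=OBJECTS):
    """Try every non-empty combination of objects and record the outcome."""
    outcomes = {}
    for size in range(1, len(objects) + 1):
        for combo in combinations(sorted(objects), size):
            outcomes[combo] = rule(combo)
    return outcomes


if __name__ == "__main__":
    for name, rule in [("disjunctive", disjunctive), ("conjunctive", conjunctive)]:
        print(name)
        for combo, lit in observe(rule).items():
            print("  ", "+".join(combo), "->", "on" if lit else "off")
```

In this sketch, single-object trials are enough to separate the two rules: object A alone activates the disjunctive machine but not the conjunctive one. An agent that only ever tests objects in combination could not tell the rules apart, which is why exploration strategy matters for this kind of task.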

Taxonomy

Topics: Language and cultural evolution