Evaluation of Deontic Conditional Reasoning in Large Language Models: The Case of Wason's Selection Task

Hirohiko Abe; Kentaro Ozeki; Risako Ando; Takanobu Morishita; Koji Mineshima; Mitsuhiro Okada

arXiv:2603.06416·cs.CL·March 9, 2026

TL;DR

Using a new Wason Selection Task dataset, this study evaluates how large language models reason with deontic conditionals. It finds that LLMs reason better with deontic rules than with descriptive ones and make human-like matching-bias errors.

Contribution

Introduces a new Wason Selection Task dataset for evaluating deontic reasoning in LLMs and analyzes the models' error patterns, highlighting similarities to human biases in rule-based tasks.

Findings

01. LLMs perform better with deontic rules

02. LLMs exhibit matching bias errors

03. Performance varies systematically across rule types

Abstract

As large language models (LLMs) advance in linguistic competence, their reasoning abilities are gaining increasing attention. Human reasoning often performs well in domain-specific settings, particularly in normative rather than purely formal contexts. Although prior studies have compared LLM and human reasoning, the domain specificity of LLM reasoning remains underexplored. In this study, we introduce a new Wason Selection Task dataset that explicitly encodes deontic modality to systematically distinguish deontic from descriptive conditionals, and use it to examine LLMs' conditional reasoning under deontic rules. We further analyze whether observed error patterns are better explained by confirmation bias (a tendency to seek rule-supporting evidence) or by matching bias (a tendency to ignore negation and select items that lexically match elements of the rule). Results show that,…
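The contrast between confirmation bias and matching bias can be made concrete with a small sketch. It follows the definitions given in the abstract and the standard four-card setup (cards showing P, not-P, Q, not-Q). The card labels, the `predictions` function, and the two example rules are illustrative assumptions, not the paper's dataset or evaluation code, and the excerpt does not say whether the dataset uses this exact manipulation.

```python
# Illustrative sketch only: card labels and function names are hypothetical,
# not taken from the paper's dataset or code.

def negate(literal: str) -> str:
    """Return the complement of a literal such as 'P' or 'not-P'."""
    return literal[4:] if literal.startswith("not-") else "not-" + literal


def atom(literal: str) -> str:
    """Strip any negation, leaving the bare term named in the rule."""
    return literal[4:] if literal.startswith("not-") else literal


def predictions(antecedent: str, consequent: str) -> dict:
    """Card selections predicted for the rule 'if antecedent then consequent'."""
    return {
        # Only a true antecedent or a false consequent can expose a violation.
        "correct": {antecedent, negate(consequent)},
        # Confirmation bias: look for cases that would support the rule.
        "confirmation": {antecedent, consequent},
        # Matching bias: pick cards naming the rule's terms, ignoring negation.
        "matching": {atom(antecedent), atom(consequent)},
    }


for rule in [("P", "Q"), ("P", "not-Q")]:
    print(f"if {rule[0]} then {rule[1]}:")
    for name, cards in predictions(*rule).items():
        print(f"  {name:<12} {sorted(cards)}")
```

Under these assumptions, the two biases predict the same selection (P and Q) for an affirmative rule "if P then Q", so that rule alone cannot separate them. With a negated consequent, "if P then not-Q", confirmation bias predicts P and not-Q while matching bias predicts P and Q, which is why rules containing negation help distinguish the two explanations.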

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification