Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives

Kentaro Ozeki; Risako Ando; Takanobu Morishita; Hirohiko Abe; Koji Mineshima; Mitsuhiro Okada

arXiv:2510.26606 · cs.AI · November 3, 2025

TL;DR

This paper evaluates large language models' ability to perform normative reasoning from logical and modal perspectives, revealing their strengths and inconsistencies, and providing a new benchmark dataset for future research.

Contribution

It introduces a comprehensive dataset and systematic evaluation framework for normative reasoning in LLMs, comparing their performance on normative and epistemic modalities.

Findings

01 LLMs generally follow valid reasoning patterns

02 Notable inconsistencies in normative reasoning types

03 Presence of cognitive biases similar to humans

Abstract

Normative reasoning is a type of reasoning that involves normative or deontic modality, such as obligation and permission. While large language models (LLMs) have demonstrated remarkable performance across various reasoning tasks, their ability to handle normative reasoning remains underexplored. In this paper, we systematically evaluate LLMs' reasoning capabilities in the normative domain from both logical and modal perspectives. Specifically, to assess how well LLMs reason with normative modals, we make a comparison between their reasoning with normative modals and their reasoning with epistemic modals, which share a common formal structure. To this end, we introduce a new dataset covering a wide range of formal patterns of reasoning in both normative and epistemic domains, while also incorporating non-formal cognitive factors that influence human reasoning. Our results indicate that,…
