TL;DR
This paper evaluates large language models' ability to perform normative reasoning from logical and modal perspectives, revealing their strengths and inconsistencies, and providing a new benchmark dataset for future research.
Contribution
The paper introduces a new dataset and a systematic evaluation framework for normative reasoning in LLMs, and compares model performance on normative modals with performance on epistemic modals.
Findings
LLMs generally follow valid reasoning patterns.
They show notable inconsistencies on specific types of normative reasoning.
They exhibit cognitive biases similar to those seen in human reasoning.
Abstract
Normative reasoning is a type of reasoning that involves normative or deontic modality, such as obligation and permission. While large language models (LLMs) have demonstrated remarkable performance across various reasoning tasks, their ability to handle normative reasoning remains underexplored. In this paper, we systematically evaluate LLMs' reasoning capabilities in the normative domain from both logical and modal perspectives. Specifically, to assess how well LLMs reason with normative modals, we make a comparison between their reasoning with normative modals and their reasoning with epistemic modals, which share a common formal structure. To this end, we introduce a new dataset covering a wide range of formal patterns of reasoning in both normative and epistemic domains, while also incorporating non-formal cognitive factors that influence human reasoning. Our results indicate that,…
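The idea that normative and epistemic modals "share a common formal structure" can be illustrated with a minimal Python sketch. It is not the paper's dataset or evaluation code. It assumes a single necessity operator (`box`) and possibility operator (`dia`), serial accessibility (a common choice in deontic logic; the paper's formal setup may differ), and hypothetical English readings in `READINGS`. All function and variable names are invented for illustration, and the check is a bounded search over two-world models, not a validity proof.

```python
from itertools import combinations, product

# Formulas are nested tuples:
#   ("atom", name), ("not", f), ("imp", f, g), ("box", f), ("dia", f)

# Hypothetical English readings of the two operators (an assumption for
# illustration, not the paper's templates).
READINGS = {
    "normative": {"box": "it is obligatory that", "dia": "it is permitted that"},
    "epistemic": {"box": "it is certain that", "dia": "it is possible that"},
}


def render(f, domain, sentences):
    """Verbalize a premise or conclusion (atoms, negation, box, dia only)."""
    kind = f[0]
    if kind == "atom":
        return sentences[f[1]]
    if kind == "not":
        return "it is not the case that " + render(f[1], domain, sentences)
    if kind in ("box", "dia"):
        return READINGS[domain][kind] + " " + render(f[1], domain, sentences)
    raise ValueError(f"cannot verbalize connective: {kind}")


def holds(f, w, rel, val):
    """Truth of formula f at world w in the Kripke model (rel, val)."""
    kind = f[0]
    if kind == "atom":
        return f[1] in val[w]
    if kind == "not":
        return not holds(f[1], w, rel, val)
    if kind == "imp":
        return (not holds(f[1], w, rel, val)) or holds(f[2], w, rel, val)
    if kind == "box":
        return all(holds(f[1], v, rel, val) for v in rel[w])
    if kind == "dia":
        return any(holds(f[1], v, rel, val) for v in rel[w])
    raise ValueError(f"unknown connective: {kind}")


def subsets(items, min_size=0):
    """All subsets of items with at least min_size elements."""
    items = list(items)
    return [frozenset(c) for k in range(min_size, len(items) + 1)
            for c in combinations(items, k)]


def holds_in_all_small_serial_models(f, atoms=("p",), n_worlds=2):
    """Bounded check: is f true at every world of every serial model
    with exactly n_worlds worlds? (Evidence of validity, not a proof.)"""
    worlds = list(range(n_worlds))
    successor_sets = subsets(worlds, min_size=1)  # non-empty => serial
    valuations = subsets(atoms)
    for rel_choice in product(successor_sets, repeat=n_worlds):
        rel = dict(zip(worlds, rel_choice))
        for val_choice in product(valuations, repeat=n_worlds):
            val = dict(zip(worlds, val_choice))
            if not all(holds(f, w, rel, val) for w in worlds):
                return False
    return True


P = ("atom", "p")
D_PATTERN = ("imp", ("box", P), ("dia", P))  # box p -> dia p
CONVERSE = ("imp", ("dia", P), ("box", P))   # dia p -> box p
```

Under these assumptions, `holds_in_all_small_serial_models(D_PATTERN)` returns `True` and `holds_in_all_small_serial_models(CONVERSE)` returns `False`. Calling `render(("box", P), domain, {"p": "Alice pays the fee"})` with `"normative"` and then `"epistemic"` turns the same formal premise into two surface forms, which is the kind of parallel item a normative-versus-epistemic comparison relies on.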