Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases

Risako Ando; Takanobu Morishita; Hirohiko Abe; Koji Mineshima; Mitsuhiro Okada

arXiv:2306.12567·cs.CL·June 23, 2023·2 cites

TL;DR

This study uses the NeuBAROCO dataset to assess whether large language models exhibit human-like biases in syllogistic reasoning, and finds that models perform worse on problems that involve belief biases, conversion errors, and atmosphere effects.

Contribution

Introduces NeuBAROCO, a bilingual (English and Japanese) syllogistic reasoning dataset, and uses it to evaluate whether LLMs show the same biases in logical inference that humans do.

Findings

01 LLMs struggle more with syllogistic problems that involve belief biases.

02 Models have more difficulty with problems that involve conversion errors.

03 Performance drops on problems that involve atmosphere effects.

Abstract

This paper investigates whether current large language models exhibit biases in logical reasoning, similar to humans. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce a dataset called NeuBAROCO, originally designed for psychological experiments that assess human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of biases observed in human syllogistic reasoning: belief biases, conversion errors, and atmosphere effects. Our findings demonstrate that current large language models struggle more with problems involving these three types of biases.
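To make two of the three bias types concrete, here is a minimal sketch of a brute-force validity checker for categorical statements. It is an illustration, not the paper's evaluation code and not the NeuBAROCO data format. The tuple encoding of statements, the function names `holds` and `is_valid`, and the choice of modern semantics (no existential import) are all assumptions made for this example. A conversion error is commonly described as reading "All A are B" as if it licensed "All B are A"; the atmosphere effect is the tendency to accept a conclusion whose form matches the form of the premises.

```python
from itertools import product

# Hypothetical encoding: a statement is (quantifier, subject, predicate).
# Each "region" is a kind of individual, given by membership in (A, B, C).
REGIONS = list(product([False, True], repeat=3))
TERM_INDEX = {"A": 0, "B": 1, "C": 2}


def holds(statement, model):
    """Evaluate a categorical statement in a model (a list of non-empty regions).

    Assumption: modern semantics, so "all X are Y" is true when X is empty.
    """
    quantifier, x, y = statement
    i, j = TERM_INDEX[x], TERM_INDEX[y]
    xs = [region for region in model if region[i]]
    if quantifier == "all":
        return all(region[j] for region in xs)
    if quantifier == "no":
        return not any(region[j] for region in xs)
    if quantifier == "some":
        return any(region[j] for region in xs)
    if quantifier == "some_not":
        return any(not region[j] for region in xs)
    raise ValueError(f"unknown quantifier: {quantifier}")


def is_valid(premises, conclusion):
    """Valid iff no model makes every premise true and the conclusion false."""
    for mask in product([False, True], repeat=len(REGIONS)):
        model = [region for region, keep in zip(REGIONS, mask) if keep]
        if all(holds(p, model) for p in premises) and not holds(conclusion, model):
            return False  # found a counter-model
    return True


# A valid syllogism: All A are B; All B are C; therefore All A are C.
barbara = is_valid([("all", "A", "B"), ("all", "B", "C")], ("all", "A", "C"))

# Conversion error: All A are B; therefore All B are A (invalid).
conversion = is_valid([("all", "A", "B")], ("all", "B", "A"))

# Atmosphere effect: two "some" premises suggest a "some" conclusion (invalid).
atmosphere = is_valid(
    [("some", "A", "B"), ("some", "B", "C")], ("some", "A", "C")
)

print(barbara, conversion, atmosphere)
```

Belief bias is not captured by this sketch, because it depends on whether the conclusion is believable given world knowledge rather than on logical form alone.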

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling
