Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism
Shi Zong, Jimmy Lin

TL;DR
This paper reviews how large language models perform on logical reasoning tasks involving categorical syllogisms, highlighting current challenges and suggesting directions for future research to improve evaluation methods.
Contribution
It systematically analyzes the variations of categorical syllogisms tested by existing datasets and discusses the limitations of current evaluation approaches for LLMs' reasoning abilities.
Findings
Crowdsourced datasets sacrifice configuration coverage for language variation.
Interpretation of quantifiers is the main bottleneck in LLM reasoning.
Current datasets do not fully test all logical configurations.
Abstract
A large number of benchmarks have been proposed to evaluate how large language models (LLMs) behave on logical inference tasks. However, it remains an open question how to properly evaluate this ability. In this paper, we provide a systematic overview of prior work on the logical reasoning ability of LLMs for analyzing categorical syllogisms. We first investigate all possible variations of categorical syllogisms from a purely logical perspective and then examine the underlying configurations (i.e., mood and figure) tested by existing datasets. Our results indicate that, compared to template-based synthetic datasets, crowdsourcing approaches normally sacrifice coverage of these configurations for more language variation, which makes it difficult to fully test LLMs under different situations. We then proceed to summarize…
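To make "mood and figure" concrete, here is a minimal sketch of how the configuration space could be enumerated from a purely logical perspective. It relies only on the standard textbook definitions: each of the three statements takes one of four forms (A, E, I, O), and the middle term can be arranged in four figures, giving 4 × 4 × 4 × 4 = 256 configurations. The function names (`enumerate_configurations`, `render`, `coverage`), the English templates, and the example terms are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

# The four categorical forms, with illustrative English templates (assumed wording).
TEMPLATES = {
    "A": "All {0} are {1}",
    "E": "No {0} are {1}",
    "I": "Some {0} are {1}",
    "O": "Some {0} are not {1}",
}

# Term order in the (major premise, minor premise) for each figure.
# S = minor term, P = major term, M = middle term.
FIGURES = {
    1: (("M", "P"), ("S", "M")),
    2: (("P", "M"), ("S", "M")),
    3: (("M", "P"), ("M", "S")),
    4: (("P", "M"), ("M", "S")),
}


def enumerate_configurations():
    """Return every (mood, figure) pair, e.g. ("AAA", 1)."""
    return [
        ("".join(mood), figure)
        for mood in product("AEIO", repeat=3)
        for figure in sorted(FIGURES)
    ]


def render(mood, figure, terms):
    """Fill the templates for one configuration; the conclusion is always S-P."""
    major, minor = FIGURES[figure]
    slots = (major, minor, ("S", "P"))
    return [
        TEMPLATES[form].format(terms[a], terms[b])
        for form, (a, b) in zip(mood, slots)
    ]


def coverage(tested):
    """Fraction of the configuration space that a (hypothetical) dataset touches."""
    return len(set(tested) & set(enumerate_configurations())) / 256


configs = enumerate_configurations()
print(len(configs))  # 256
print(render("AAA", 1, {"S": "Greeks", "M": "humans", "P": "mortal"}))
```

A helper like `coverage` is one simple way to express the comparison described in the abstract: a template-based generator can iterate over all of `configs`, whereas a crowdsourced dataset would first need each example annotated with its mood and figure before its coverage could be measured.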
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques
