Reasoning Capabilities and Invariability of Large Language Models

Alessandro Raganato; Rafael Peñaloza; Marco Viviani; Gabriella Pasi

arXiv:2505.00776 · cs.CL · May 5, 2025

TL;DR

This paper evaluates the reasoning abilities of large language models using a new geometric reasoning benchmark, revealing their strengths and limitations in zero-shot and chain-of-thought prompting scenarios.

Contribution

It introduces a novel benchmark dataset for simple geometric reasoning tasks and provides a comprehensive empirical analysis of LLMs' reasoning capabilities and prompt dependency.

Findings

01. LLMs over 70B parameters perform better in zero-shot settings

02. Chain-of-thought prompting can improve or impair performance depending on implementation

03. Significant room for improvement remains in LLM reasoning abilities

Abstract

Large Language Models (LLMs) have shown remarkable capabilities in manipulating natural language across multiple applications, but their ability to handle simple reasoning tasks is often questioned. In this work, we aim to provide a comprehensive analysis of LLMs' reasoning competence, specifically focusing on their prompt dependency. In particular, we introduce a new benchmark dataset with a series of simple reasoning questions demanding shallow logical reasoning. Aligned with cognitive psychology standards, the questions are confined to a basic domain revolving around geometric figures, ensuring that responses are independent of any pre-existing intuition about the world and rely solely on deduction. An empirical analysis involving zero-shot and few-shot prompting across 24 LLMs of different sizes reveals that, while LLMs with over 70 billion parameters perform better in the zero-shot…

Code & Models

Repositories

ikr3-lab/ReasoningLLMs (Official)

Taxonomy

Topics: Natural Language Processing Techniques