Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus
Seungpil Lee, Woochang Sim, Donghyeon Shin, Wongyu Seo, Jiwon Park, Seokki Lee, Sanha Hwang, Sejin Kim, Sundong Kim

TL;DR
This paper introduces a process-centric evaluation method for LLM reasoning abilities using the ARC benchmark, focusing on logical coherence, compositionality, and productivity, revealing gaps compared to human reasoning.
Contribution
It presents a novel approach based on the Language of Thought Hypothesis to assess reasoning processes, not just results, in LLMs.
Findings
LLMs demonstrate some inference capabilities but lag significantly behind human-level reasoning in logical coherence, compositionality, and productivity.
The LoTH perspective offers a way to evaluate the reasoning process itself, not only the final results.
Abstract
The existing methods for evaluating the inference abilities of Large Language Models (LLMs) have been predominantly results-centric, making it challenging to assess the inference process comprehensively. We introduce a novel approach using the Abstraction and Reasoning Corpus (ARC) benchmark to evaluate the inference and contextual understanding abilities of LLMs in a process-centric manner, focusing on three key components from the Language of Thought Hypothesis (LoTH): Logical Coherence, Compositionality, and Productivity. Our carefully designed experiments reveal that while LLMs demonstrate some inference capabilities, they still significantly lag behind human-level reasoning in these three aspects. The main contribution of this paper lies in introducing the LoTH perspective, which provides a method for evaluating the reasoning process that conventional results-oriented approaches…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
