Understanding LLMs' Fluid Intelligence Deficiency: An Analysis of the ARC Task

Junjie Wu; Mo Yu; Lemao Liu; Dit-Yan Yeung; Jie Zhou

arXiv:2502.07190·cs.AI·March 4, 2025

TL;DR

This paper investigates why large language models struggle to demonstrate fluid intelligence, using the ARC task as a test case, and identifies three limitations: limited skill composition, unfamiliarity with abstract input formats, and the intrinsic deficiency of left-to-right decoding.

Contribution

The study uses controlled experiments on the ARC task to analyze why LLMs struggle to demonstrate fluid intelligence, and isolates three major limitations in existing models.

Findings

01

LLMs have limited skill composition abilities.

02

Unfamiliarity with abstract input formats hampers performance.

03

The intrinsic left-to-right decoding of LLMs is itself a deficiency on fluid intelligence tasks.

Abstract

While LLMs have exhibited strong performance on various NLP tasks, it is noteworthy that most of these tasks rely on utilizing the vast amount of knowledge encoded in LLMs' parameters, rather than solving new problems without prior knowledge. In cognitive research, the latter ability is referred to as fluid intelligence, which is considered to be critical for assessing human intelligence. Recent research on fluid intelligence assessments has highlighted significant deficiencies in LLMs' abilities. In this paper, we analyze the challenges LLMs face in demonstrating fluid intelligence through controlled experiments, using the most representative ARC task as an example. Our study revealed three major limitations in existing LLMs: limited ability for skill composition, unfamiliarity with abstract input formats, and the intrinsic deficiency of left-to-right decoding. Our data and code can be…

Code & Models

Repositories

wujunjie1998/ref-long
pytorch

Videos

Understanding LLMs' Fluid Intelligence Deficiency: An Analysis of the ARC Task

Taxonomy

Topics

Technology and Data Analysis