The ARC of Progress towards AGI: A Living Survey of Abstraction and Reasoning
Sahar Vahdati, Andrei Aioanei, Haridhra Suresh, Jens Lehmann

TL;DR
This survey analyzes 82 approaches to the ARC-AGI benchmark, revealing consistent performance drops across versions and highlighting the importance of test-time adaptation, while emphasizing the ongoing challenges in compositional reasoning for achieving AGI.
Contribution
It provides the first comprehensive cross-generation analysis of ARC-AGI approaches, tracking progress, costs, and limitations across multiple benchmark versions and competitions.
Findings
Performance drops 2-3x from ARC-AGI-1 to ARC-AGI-2 across all paradigms
Cost fell 390x in one year, largely reflecting reduced test-time parallelism
Test-time adaptation is crucial for success
Abstract
The Abstraction and Reasoning Corpus (ARC-AGI) has become a key benchmark for fluid intelligence in AI. This survey presents the first cross-generation analysis of 82 approaches across three benchmark versions and the ARC Prize 2024-2025 competitions. Our central finding is that performance degradation across versions is consistent across all paradigms: program synthesis, neuro-symbolic, and neural approaches all exhibit 2-3x drops from ARC-AGI-1 to ARC-AGI-2, indicating fundamental limitations in compositional generalization. While systems now reach 93.0% on ARC-AGI-1 (Opus 4.6), performance falls to 68.8% on ARC-AGI-2 and 13% on ARC-AGI-3, as humans maintain near-perfect accuracy across all versions. Cost fell 390x in one year (o3's 12/task), although this largely reflects reduced test-time parallelism. Trillion-scale models vary widely in score and cost,…
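The "Nx drop" figures above are ratios of accuracy before and after a benchmark change. As a minimal sketch, the snippet below computes that ratio for the best scores quoted in the abstract. The helper name `drop_factor` and the dictionary layout are illustrative assumptions, not part of the survey. Because these are single best-system scores, the resulting ratios need not match the 2-3x figure, which the abstract reports per paradigm.

```python
def drop_factor(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Best reported scores quoted in the abstract (percent accuracy).
best_scores = {"ARC-AGI-1": 93.0, "ARC-AGI-2": 68.8, "ARC-AGI-3": 13.0}

v1_to_v2 = drop_factor(best_scores["ARC-AGI-1"], best_scores["ARC-AGI-2"])
v2_to_v3 = drop_factor(best_scores["ARC-AGI-2"], best_scores["ARC-AGI-3"])

print(f"ARC-AGI-1 -> ARC-AGI-2: {v1_to_v2:.2f}x")
print(f"ARC-AGI-2 -> ARC-AGI-3: {v2_to_v3:.2f}x")
```

The same helper applies to cost comparisons, where `before` and `after` would be per-task costs rather than accuracies.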
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
