ARC Prize 2024: Technical Report

Francois Chollet; Mike Knoop; Gregory Kamradt; Bryan Landers

arXiv:2412.04604 · cs.AI · January 9, 2025 · 3 cites


TL;DR

The ARC Prize 2024 report surveys the techniques and insights that emerged from a global competition on the ARC-AGI generalization benchmark, during which the state-of-the-art score on the private evaluation set rose from 33% to 55.5%.

Contribution

This paper describes the ARC Prize 2024 competition, surveys the top approaches and new open-source implementations, and discusses the limitations of the ARC-AGI-1 dataset along with key insights gained from the competition.

Findings

01. The state-of-the-art score on the ARC-AGI private evaluation set increased from 33% to 55.5%.

02. Frontier reasoning techniques, including deep learning-guided program synthesis and test-time training, drove the improvement.

03. The report discusses the limitations of the ARC-AGI-1 dataset.

Abstract

As of December 2024, the ARC-AGI benchmark is five years old and remains unbeaten. We believe it is currently the most important unsolved AI benchmark in the world because it seeks to measure generalization on novel tasks, the essence of intelligence, as opposed to skill at tasks that can be prepared for in advance. This year, we launched ARC Prize, a global competition to inspire new ideas and drive open progress towards AGI by reaching a target benchmark score of 85%. As a result, the state-of-the-art score on the ARC-AGI private evaluation set increased from 33% to 55.5%, propelled by several frontier AGI reasoning techniques including deep learning-guided program synthesis and test-time training. In this paper, we survey top approaches, review new open-source implementations, discuss the limitations of the ARC-AGI-1 dataset, and share key insights gained from the competition.
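To put the figures in the abstract in proportion, the following is a minimal sketch of the arithmetic relating the previous score (33%), the new score (55.5%), and the 85% target. The function name `score_progress` and the dictionary keys are hypothetical and chosen only for illustration; the three input numbers are the only values taken from the abstract.

```python
def score_progress(previous: float, current: float, target: float) -> dict:
    """Summarize benchmark progress. All inputs are percentages (0-100).

    Hypothetical helper for illustration; not part of any ARC Prize tooling.
    """
    gain = current - previous
    return {
        # Absolute improvement, in percentage points.
        "gain_points": gain,
        # Improvement relative to the previous score (fraction).
        "relative_gain": gain / previous,
        # Percentage points still missing to reach the target.
        "remaining_points": target - current,
        # Fraction of the original gap to the target that was closed.
        "gap_closed": gain / (target - previous),
    }


# Values taken from the abstract: 33% -> 55.5%, target 85%.
summary = score_progress(previous=33.0, current=55.5, target=85.0)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the gain is 22.5 percentage points, which leaves 29.5 points between the 2024 state of the art and the 85% target.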

Code & Models

Repositories

michaelhodel/re-arc

Videos

Can Latent Program Networks Solve Abstract Reasoning? [Clement Bonnet] · YouTube

Taxonomy

Topics: Big Data and Digital Economy · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)

Methods: Sparse Evolutionary Training