ARCTraj: A Dataset and Benchmark of Human Reasoning Trajectories for Abstract Problem Solving

Sejin Kim; Hayan Choi; Seokki Lee; Sundong Kim

arXiv:2511.11079 · cs.AI · February 17, 2026

TL;DR

ARCTraj introduces a comprehensive dataset capturing human reasoning trajectories in complex visual tasks, enabling the study of reasoning processes over time and supporting various AI modeling approaches.

Contribution

ARCTraj provides the first large-scale, temporally annotated dataset of human reasoning steps on Abstraction and Reasoning Corpus (ARC) tasks, along with a unified framework for analyzing and modeling reasoning processes.

Findings

01 Reveals diverse reasoning strategies used by humans.

02 Demonstrates the dataset's utility across reinforcement learning and generative models.

03 Highlights the structure and complexity of human problem-solving trajectories.

Abstract

We present ARCTraj, a dataset and methodological framework for modeling human reasoning through complex visual tasks in the Abstraction and Reasoning Corpus (ARC). While ARC has inspired extensive research on abstract reasoning, most existing approaches rely on static input-output supervision, which limits insight into how reasoning unfolds over time. ARCTraj addresses this gap by recording temporally ordered, object-level actions that capture how humans iteratively transform inputs into outputs, revealing intermediate reasoning steps that conventional datasets overlook. Collected via the O2ARC web interface, it contains around 10,000 trajectories annotated with task identifiers, timestamps, and success labels across 400 training tasks from the ARC-AGI-1 benchmark. It further defines a unified reasoning pipeline encompassing data collection, action abstraction, Markov decision process…
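The abstract describes trajectories as temporally ordered actions annotated with task identifiers, timestamps, and success labels, and mentions a Markov decision process formulation. The following is a minimal sketch of how such a record might be turned into state-action-next-state transitions. It does not reflect the actual ARCTraj schema: the class names `Step` and `Trajectory`, the field names, the action labels, the timestamp unit, and the helper `to_transitions` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[int]]  # an ARC grid: 2D array of integer color indices


@dataclass
class Step:
    timestamp: float  # assumed unit: seconds since the attempt started
    action: str       # hypothetical action label, e.g. "fill"
    grid: Grid        # grid state after the action was applied


@dataclass
class Trajectory:
    task_id: str      # ARC task identifier
    success: bool     # whether the attempt was labeled successful
    steps: List[Step]


def to_transitions(initial: Grid, traj: Trajectory) -> List[Tuple[Grid, str, Grid]]:
    """Convert a trajectory into (state, action, next_state) triples."""
    transitions = []
    state = initial
    # Order steps by timestamp so transitions follow the human's sequence.
    for step in sorted(traj.steps, key=lambda s: s.timestamp):
        transitions.append((state, step.action, step.grid))
        state = step.grid
    return transitions


# Usage with made-up data (not taken from the dataset):
initial = [[0, 0], [0, 0]]
traj = Trajectory(
    task_id="example-task",
    success=True,
    steps=[
        Step(2.0, "fill", [[1, 1], [0, 0]]),
        Step(1.0, "fill", [[1, 0], [0, 0]]),
    ],
)
transitions = to_transitions(initial, traj)
```

Sorting by timestamp before pairing states is one simple way to recover the temporal order the abstract emphasizes. The dataset's actual field names and action vocabulary should be checked against its own documentation.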

Code & Models

Datasets

SejinKimm/ARCTraj
dataset · 34 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Action Observation and Synchronization · Explainable Artificial Intelligence (XAI)