Boule or Baguette? A Study on Task Topology, Length Generalization, and the Benefit of Reasoning Traces
William L. Tong, Ege Cakar, Cengiz Pehlevan

TL;DR
This paper introduces PITA, a large-scale dataset of propositional logic statements and their proofs, and investigates how reasoning traces (RTs) affect a model's ability to generalize to longer proofs. RT models do well on broad, shallow tasks but struggle on narrow, deep ones.
Contribution
The paper presents PITA, a new large-scale reasoning dataset of over 23 million propositional logic statements with proofs, and uses length generalization to analyze when reasoning traces help and when they do not.
Findings
RT models excel on broad, shallow tasks
RT models struggle with narrow, deep tasks
Generalization performance depends on task breadth and depth
Abstract
Recent years have witnessed meteoric progress in reasoning models: neural networks that generate intermediate reasoning traces (RTs) before producing a final output. Despite the rapid advancement, our understanding of how RTs support reasoning, and the limits of this paradigm, remains incomplete. To promote greater clarity, we introduce PITA: a novel large-scale dataset of over 23 million statements in propositional logic and their corresponding proofs. As a benchmark for robust reasoning, we focus on length generalization: if a model is trained to determine truth or falsity on statements with proofs up to fixed length, how well does it generalize to statements requiring longer proofs? We propose notions of (1) task depth and (2) task breadth, which measure respectively (1) the number of steps required to solve an example from a task and (2) the number of unique examples across a task.…
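The length-generalization setup and the two task measures described in the abstract can be illustrated with a minimal sketch. It is not PITA's actual data format or the authors' code: the `Example` record, the function names, the toy statements, and their step counts are all hypothetical, and taking task depth as the maximum proof length over a task is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Example:
    """Hypothetical record; not PITA's real schema."""
    statement: str    # propositional formula written as plain text
    proof_steps: int  # number of steps in its proof (made-up values below)
    label: bool       # True if the statement holds, False otherwise


def length_split(
    examples: List[Example], max_train_steps: int
) -> Tuple[List[Example], List[Example]]:
    """Train on proofs of at most max_train_steps steps, test on longer ones."""
    train = [ex for ex in examples if ex.proof_steps <= max_train_steps]
    test = [ex for ex in examples if ex.proof_steps > max_train_steps]
    return train, test


def task_depth(examples: List[Example]) -> int:
    """Assumed reading of depth: the longest proof needed within the task."""
    return max(ex.proof_steps for ex in examples)


def task_breadth(examples: List[Example]) -> int:
    """Breadth as the number of unique examples in the task."""
    return len({ex.statement for ex in examples})


# Toy task with invented step counts, only to exercise the functions.
toy = [
    Example("p -> p", 1, True),
    Example("(p & q) -> p", 2, True),
    Example("p & ~p", 2, False),
    Example("((p -> q) & (q -> r)) -> (p -> r)", 5, True),
]
train, test = length_split(toy, max_train_steps=2)
```

With this split, a model would be fit on `train` and scored on `test`, so any accuracy gap between the two reflects how well it extends to proofs longer than those seen in training.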
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Constraint Satisfaction and Optimization
