TL;DR
This paper introduces Equilibrium Reasoners, a scalable reasoning framework that learns task-specific attractors in latent space, enabling efficient and adaptive test-time computation for complex reasoning tasks.
Contribution
It formalizes the concept of learned attractors in iterative models and demonstrates their effectiveness in scaling reasoning performance without external verifiers.
Findings
Test-time scaling improves accuracy from 2.6% to over 99% on Sudoku-Extreme.
Stronger convergence to attractors correlates with better reasoning performance.
Harder tasks benefit significantly from extensive test-time iterations.
Abstract
Scaling test-time compute by iteratively updating a latent state has emerged as a powerful paradigm for reasoning. Yet the internal mechanisms that enable these iterative models to generalize beyond memorized patterns remain unclear. We hypothesize that generalizable reasoning arises from learning task-conditioned attractors: latent dynamical systems whose stable fixed points correspond to valid solutions. We formalize this process through Equilibrium Reasoners (EqR), which enable test-time scaling without external verifiers or task-specific priors. EqR scales internal dynamics along two axes: depth, by running more iterations, and breadth, by aggregating stochastic trajectories from multiple initializations. Empirically, gains from test-time scaling are tightly coupled with stronger convergence toward solution-aligned attractors. This attractor perspective allows neural networks to…
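The two scaling axes described in the abstract can be illustrated with a toy fixed-point iteration. This is only a sketch under stated assumptions, not the EqR model. The `update` rule here is a hand-written map whose stable fixed point is sqrt(x), standing in for a learned, task-conditioned update. The names `run_depth` and `run_breadth` are hypothetical. Selecting the trajectory with the smallest fixed-point residual is one assumed way to aggregate without an external verifier; the paper's actual aggregation rule is not given in this summary.

```python
import random


def update(z, x):
    """One step of a toy latent update (assumed stand-in for a learned network).

    Its stable fixed point is sqrt(x), playing the role of a valid solution.
    """
    return 0.5 * (z + x / z)


def run_depth(x, z0, num_iters):
    """Depth scaling: apply the same update num_iters times from state z0.

    Returns the final state and its fixed-point residual |update(z) - z|,
    an internal convergence signal that needs no external verifier.
    """
    z = z0
    for _ in range(num_iters):
        z = update(z, x)
    residual = abs(update(z, x) - z)
    return z, residual


def run_breadth(x, num_samples, num_iters, seed=0):
    """Breadth scaling: run several trajectories from random initial states.

    Aggregation (an assumption): keep the trajectory with the smallest residual.
    """
    rng = random.Random(seed)
    candidates = [
        run_depth(x, rng.uniform(0.1, 10.0), num_iters)
        for _ in range(num_samples)
    ]
    return min(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    # More iterations shrink the residual as the state approaches the attractor.
    for iters in (1, 2, 4, 8):
        z, residual = run_depth(2.0, 10.0, iters)
        print(f"depth={iters}: z={z:.6f} residual={residual:.2e}")

    z, residual = run_breadth(2.0, num_samples=4, num_iters=8)
    print(f"breadth=4: z={z:.6f} residual={residual:.2e}")
```

In this toy setting the residual plays the role of "convergence toward the attractor": it falls as depth grows, and it gives breadth a selection criterion that uses only the model's own dynamics.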
