One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models

Chris Cameron; Wangzheng Wang; Nikita Ivanov; Ashmita Bhattacharyya; Didier Chételat; Yingxue Zhang

arXiv:2604.18839·cs.LG·April 22, 2026

TL;DR

This paper introduces Denoising Recursion Models, which improve iterative refinement on reasoning tasks by training the model to reverse noise over multiple steps rather than a single step. This better aligns training with test-time behaviour and improves performance.

Contribution

The paper proposes a training method for recursive models that corrupts data with noise and trains the model to reverse it over multiple steps. The resulting models outperform the Tiny Recursion Model on ARC-AGI.

Findings

01  Outperforms Tiny Recursion Model on ARC-AGI

02  Better alignment of training and testing behaviors

03  Encourages non-greedy, forward-looking generation

Abstract

Looped transformers scale computational depth without increasing parameter count by repeatedly applying a shared transformer block and can be used for iterative refinement, where each loop rewrites a full fixed-size prediction in parallel. On difficult problems, such as those that require search-like computation, reaching a highly structured solution starting from noise can require long refinement trajectories. Learning such trajectories is challenging when training specifies only the target solution and provides no supervision over the intermediate refinement path. Diffusion models tackle this issue by corrupting data with varying magnitudes of noise and training the model to reverse it in a single step. However, this process misaligns training and testing behaviour. We introduce Denoising Recursion Models, a method that similarly corrupts data with noise but trains the model…
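
The abstract is cut off before it states the exact training objective, so the following is only a toy sketch of the two ingredients it names: corrupting a target with noise, and refining a full fixed-size prediction by applying one shared block repeatedly. Everything here is an assumption made for illustration, not the authors' implementation: the names `corrupt`, `make_toy_block`, `refine`, and `mse` are hypothetical, the "block" is a hand-written contraction rather than a learned transformer, and no training is performed. The sketch only contrasts a loss measured after one application of the block with a loss measured after K weight-tied applications.

```python
import random

def corrupt(target, noise_level, rng):
    """Blend the clean target with Gaussian noise (noise_level in [0, 1])."""
    return [(1.0 - noise_level) * t + noise_level * rng.gauss(0.0, 1.0)
            for t in target]

def make_toy_block(attractor, rate=0.5):
    """Hypothetical stand-in for a shared transformer block (assumption).

    Every position is rewritten in parallel, moving a fraction `rate`
    of the way toward `attractor`. A real block would be learned.
    """
    def block(state):
        return [s + rate * (a - s) for s, a in zip(state, attractor)]
    return block

def refine(state, block, num_loops):
    """Apply the same block num_loops times (weight-tied recursion)."""
    for _ in range(num_loops):
        state = block(state)
    return state

def mse(prediction, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)

rng = random.Random(0)
target = [1.0, 0.0, 1.0, 1.0]          # toy fixed-size "solution"
block = make_toy_block(attractor=target)
noisy = corrupt(target, noise_level=0.8, rng=rng)

# Loss after a single application of the block.
single_step_loss = mse(refine(noisy, block, 1), target)
# Loss after K = 4 applications of the same block.
multi_step_loss = mse(refine(noisy, block, 4), target)
```

In this toy setting each loop halves the remaining error, so the loss after four loops is smaller than the loss after one. In a learned model the comparison would depend on how the block is trained, which is the question the paper addresses.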
