From Growing to Looping: A Unified View of Iterative Computation in LLMs
Ferdinand Kapl, Emmanouil Angelis, Kaitlin Maile, Johannes von Oswald, Stefan Bauer

TL;DR
This paper unifies the concepts of looping and depth growing in large language models, showing they share underlying mechanisms and can be combined to enhance reasoning capabilities through iterative computation.
Contribution
It provides a mechanistic unification of looping and depth growing, demonstrating their shared signatures and practical benefits when combined for improved reasoning.
Findings
Looped and depth-grown models show convergent depth-wise signatures.
Applying inference-time looping improves accuracy on reasoning tasks.
Depth-grown models benefit from higher-quality training mixtures.
Abstract
Looping, reusing a block of layers across depth, and depth growing, training shallow-to-deep models by duplicating middle layers, have both been linked to stronger reasoning, but their relationship remains unclear. We provide a mechanistic unification: looped and depth-grown models exhibit convergent depth-wise signatures, including increased reliance on late layers and recurring patterns aligned with the looped or grown block. These shared signatures support the view that their gains stem from a common form of iterative computation. Building on this connection, we show that the two techniques are adaptable and composable: applying inference-time looping to the middle blocks of a depth-grown model improves accuracy on some reasoning primitives by up to , despite the model never being trained to loop. Both approaches also adapt better than the baseline when given more in-context…
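The two operations defined in the abstract can be sketched in a few lines. This is only a toy illustration, not the authors' implementation: layers are plain Python functions on a float, and the names `run_looped`, `grow_depth`, `start`, `end`, and `n_loops` are hypothetical. In a real depth-grown model the duplicated layers are trained further and drift apart, whereas a looped model keeps sharing the same weights; the sketch only captures the moment of duplication.

```python
from typing import Callable, List, Sequence

# Assumption: a "layer" is any function from a hidden state to a hidden state.
Layer = Callable[[float], float]


def run_looped(layers: Sequence[Layer], x: float,
               start: int, end: int, n_loops: int = 1) -> float:
    """Apply layers[:start], then the block layers[start:end] n_loops times,
    then layers[end:]. With n_loops=1 this is an ordinary forward pass."""
    for layer in layers[:start]:
        x = layer(x)
    for _ in range(n_loops):
        for layer in layers[start:end]:
            x = layer(x)
    for layer in layers[end:]:
        x = layer(x)
    return x


def grow_depth(layers: Sequence[Layer], start: int, end: int) -> List[Layer]:
    """Return a deeper stack in which the middle block appears twice.
    Here the copies are the same objects; in training they would be
    separate parameters initialised from the originals."""
    block = list(layers[start:end])
    return list(layers[:start]) + block + block + list(layers[end:])


# Toy four-layer "model" with a two-layer middle block (indices 1 and 2).
layers: List[Layer] = [
    lambda h: h + 1.0,
    lambda h: h * 2.0,
    lambda h: h - 3.0,
    lambda h: h + 1.0,
]

looped_twice = run_looped(layers, 0.0, start=1, end=3, n_loops=2)
grown = grow_depth(layers, start=1, end=3)
grown_once = run_looped(grown, 0.0, start=1, end=3, n_loops=1)
```

In this toy setting, running the original stack with the middle block looped twice gives the same result as a single pass through the grown stack, which is one way to picture why the two techniques might share a common form of iterative computation.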
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks · Topic Modeling
