When to Ponder: Adaptive Compute Allocation for Code Generation via Test-Time Training

Gihyeon Sim

arXiv:2601.00894·cs.LG·January 6, 2026

Open Access

TL;DR

This paper introduces PonderTTT, a training-free gating strategy that selectively applies test-time training updates in language models for code, reaching up to 16% lower loss than random skipping on out-of-distribution languages.

Contribution

It proposes a training-free gating mechanism that uses the TTT layer's self-supervised reconstruction loss to decide when to trigger test-time training updates.

Findings

01 Achieves 82-89% oracle recovery rate

02 Significantly outperforms random skip baselines

03 Requires no ground-truth labels during inference

Abstract

Large language models apply uniform computation to all inputs, regardless of difficulty. We propose PonderTTT, a gating strategy that uses the TTT layer's self-supervised reconstruction loss to selectively trigger Test-Time Training (TTT) updates. The gating decision itself is training-free: it requires no learned classifier or auxiliary network. Only a single scalar threshold is calibrated up front on unlabeled data and then continuously adapted via an exponential moving average (EMA) to maintain a target update rate. Our experiments with GPT-2 models (124M to 1.5B) on code language modeling (The Stack v2, teacher-forced perplexity) demonstrate that this signal is inference-compatible, requiring no ground-truth labels. Our Reconstruction Gating achieves 82-89% Oracle Recovery while being fully training-free, significantly outperforming Random Skip baselines (up to 16% lower loss on OOD languages).
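The gating loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class and function names, the quantile-based calibration, the additive threshold nudge, and all hyperparameter values are assumptions made for illustration. The only parts taken from the abstract are that a single scalar threshold on the reconstruction loss decides whether to update, that it is calibrated on unlabeled data, and that it is adapted with an EMA to hold a target update rate.

```python
from typing import Iterable


def calibrate_threshold(losses: Iterable[float], target_rate: float) -> float:
    """Assumed calibration: pick a threshold so that roughly `target_rate`
    of the unlabeled calibration losses are at or above it."""
    ordered = sorted(losses)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one calibration loss")
    # Number of calibration chunks that should trigger an update (1..n).
    k = min(n, max(1, round(target_rate * n)))
    return ordered[n - k]


class ReconstructionGate:
    """Hypothetical gate: fire a TTT update when the reconstruction loss
    is at or above a scalar threshold, then nudge the threshold."""

    def __init__(self, threshold: float, target_rate: float = 0.5,
                 ema_decay: float = 0.9, step_size: float = 0.1) -> None:
        self.threshold = threshold
        self.target_rate = target_rate
        self.ema_decay = ema_decay      # assumed smoothing factor
        self.step_size = step_size      # assumed adaptation step
        self.ema_rate = target_rate     # EMA of recent gate decisions

    def should_update(self, recon_loss: float) -> bool:
        fire = recon_loss >= self.threshold
        # Track the observed update rate with an exponential moving average.
        self.ema_rate = (self.ema_decay * self.ema_rate
                         + (1.0 - self.ema_decay) * float(fire))
        # Updating too often raises the threshold; too rarely lowers it.
        self.threshold += self.step_size * (self.ema_rate - self.target_rate)
        return fire


if __name__ == "__main__":
    calibration_losses = [0.4, 0.1, 0.3, 0.2]  # made-up values
    gate = ReconstructionGate(calibrate_threshold(calibration_losses, 0.5))
    for loss in [0.45, 0.05, 0.35, 0.15]:
        action = "TTT update" if gate.should_update(loss) else "skip"
        print(f"loss={loss:.2f} -> {action}")
```

Because the decision depends only on the reconstruction loss, which is computed from the input itself, nothing in this loop needs ground-truth labels. The step size has to be chosen relative to the scale of the losses; the value here is arbitrary.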

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms