Mode-Conditioning Unlocks Superior Test-Time Scaling

Chen Henry Wu; Sachin Goyal; Aditi Raghunathan

arXiv:2512.01127·cs.LG·December 2, 2025

TL;DR

Mode-conditioning (ModC) improves test-time scaling by explicitly allocating parallel samples across reasoning modes, which counters diversity collapse and raises Pass@k on large-scale reasoning tasks; the authors also report benefits for reinforcement learning.

Contribution

The paper introduces ModC, a framework that explicitly allocates test-time compute across reasoning modes using either specialist models or mode-specific prefixes. When explicit mode labels are unavailable, gradient clustering can be used to identify the modes instead.

Findings

01

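ModC consistently improves scaling on controlled graph-search tasks and large-scale reasoning benchmarks, across model families and sizes from 0.5B to 7B.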

02

On OpenThoughts, fine-tuning Qwen2.5-7B with ModC yields a 4x efficiency gain over standard training and a higher maximum attainable Pass@k.

03

Gradient clustering enables ModC without explicit mode labels, with gains of up to 10% on datasets such as NuminaMath.

Abstract

Parallel sampling promises substantial gains in test-time scaling, but its effectiveness is sharply limited by diversity collapse, where models concentrate on a few modes and repeated samples produce the same mistakes. We propose the mode-conditioning (ModC) framework, which explicitly allocates test-time compute across reasoning modes using either specialist models or mode-specific prefixes. ModC consistently improves scaling across controlled graph-search tasks and large-scale reasoning benchmarks, spanning model families and sizes from 0.5B to 7B. On OpenThoughts, fine-tuning Qwen2.5-7B with ModC achieves a 4x efficiency gain over standard training while also improving the maximum attainable Pass@k. We further show that gradient clustering enables ModC without explicit mode labels, yielding up to 10% gains on datasets such as NuminaMath. Finally, we show that ModC improves…
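To make the diversity-collapse argument concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a single problem, two hypothetical reasoning modes with made-up sampling weights and per-sample success rates, and independent samples. It compares Pass@k when all k samples come from the collapsed mixture against Pass@k when the same budget is split evenly across modes.

```python
from math import prod


def pass_at_k_mixture(weights, success, k):
    """Pass@k when all k samples are drawn i.i.d. from the mode mixture.

    weights[m] is the probability that a sample lands in mode m and
    success[m] is the per-sample success rate inside mode m.
    """
    p = sum(w * s for w, s in zip(weights, success))
    return 1.0 - (1.0 - p) ** k


def pass_at_k_allocated(success, allocation):
    """Pass@k when mode m receives allocation[m] independent samples."""
    return 1.0 - prod((1.0 - s) ** n for s, n in zip(success, allocation))


# Hypothetical numbers: the model has collapsed onto mode 0, which never
# solves this problem, while the rarely sampled mode 1 solves it half the time.
weights = [0.99, 0.01]
success = [0.0, 0.5]
k = 8

standard = pass_at_k_mixture(weights, success, k)
mode_conditioned = pass_at_k_allocated(success, [k // 2, k // 2])

print(f"standard sampling Pass@{k}: {standard:.4f}")
print(f"even split across modes Pass@{k}: {mode_conditioned:.4f}")
```

In this toy setting the even split helps because half of the budget is guaranteed to reach the mode that can solve the problem. The even allocation is an illustrative choice only; the truncated abstract here does not specify how ModC divides the budget.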

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks