TL;DR
AutoTTS introduces an environment-driven framework for automatically discovering test-time scaling (TTS) strategies for large language models, improving on manual heuristics in both performance and efficiency.
Contribution
It shifts TTS strategy design from manual heuristics to automated environment-based discovery, enabling scalable and generalizable solutions.
Findings
Discovered strategies improve the accuracy-cost tradeoff on benchmarks.
Strategies generalize across benchmarks and model scales.
The discovery process is efficient, costing only $39.9 and 160 minutes.
Abstract
Test-time scaling (TTS) has become an effective approach for improving large language model performance by allocating additional computation during inference. However, existing TTS strategies are largely hand-crafted: researchers manually design reasoning patterns and tune heuristics by intuition, leaving much of the computation-allocation space unexplored. We propose an environment-driven framework, AutoTTS, that changes what researchers design: from individual TTS heuristics to environments where TTS strategies can be discovered automatically. The key to AutoTTS lies in environment construction: the discovery environment must make the control space tractable and provide cheap, frequent feedback for TTS search. As a concrete instantiation, we formulate width-depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals, where controllers decide when to…
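To illustrate why pre-collected trajectories can give cheap, frequent feedback, here is a minimal sketch of scoring a candidate controller by replaying stored trajectories instead of calling a model. It is not the paper's implementation. The abstract is cut off before it says what controllers decide, so everything here is a hypothetical assumption made for illustration: the `Trajectory` class, the `"continue"`/`"prune"` action set, treating probe signals as per-step confidence scores, counting cost in steps, and aggregating by majority vote.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Trajectory:
    """One stored reasoning trajectory (hypothetical layout)."""
    probes: List[float]  # assumed per-step probe signal, e.g. a confidence score
    answer: str          # final answer reached if the trajectory runs to the end


# A controller sees the probe values observed so far on one trajectory and
# returns an action. The two actions are illustrative assumptions.
Controller = Callable[[Sequence[float]], str]  # "continue" or "prune"


def replay(controller: Controller,
           trajectories: Sequence[Trajectory],
           budget: int) -> Tuple[Optional[str], int]:
    """Replay stored trajectories under a controller; no model calls are made.

    Returns the majority-vote answer over trajectories that ran to completion
    (None if there are none) and the number of steps consumed.
    """
    votes: Counter = Counter()
    cost = 0
    for traj in trajectories:
        finished = True
        for step in range(1, len(traj.probes) + 1):
            if cost >= budget:          # out of budget: abandon this trajectory
                finished = False
                break
            cost += 1                   # pay for the step, then observe its probe
            if controller(traj.probes[:step]) == "prune":
                finished = False
                break
        if finished:
            votes[traj.answer] += 1
    answer = votes.most_common(1)[0][0] if votes else None
    return answer, cost


def always_continue(probes: Sequence[float]) -> str:
    """Baseline controller: never prunes."""
    return "continue"


def prune_low_confidence(probes: Sequence[float]) -> str:
    """Toy controller: prune when the latest probe value drops below 0.5."""
    return "prune" if probes[-1] < 0.5 else "continue"


# Made-up toy data, used only to exercise the functions above.
toy = [
    Trajectory([0.9, 0.8, 0.9], "42"),
    Trajectory([0.2, 0.1, 0.3], "41"),
    Trajectory([0.7, 0.9, 0.8], "42"),
]

baseline = replay(always_continue, toy, budget=100)
pruned = replay(prune_low_confidence, toy, budget=100)
```

Because each evaluation only reads stored data, a search procedure could score many candidate controllers on accuracy and step cost without any further inference.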
