Revisiting Test-Time Scaling: A Survey and a Diversity-Aware Method for Efficient Reasoning

Ho-Lam Chung; Teng-Yun Hsiao; Hsiao-Ying Huang; Chunerh Cho; Jian-Ren Lin; Zhang Ziwei; Yun-Nung Chen

arXiv:2506.04611·cs.CL·June 6, 2025

TL;DR

This paper surveys Test-Time Scaling (TTS) methods for Large Language Models and introduces ADAPT, a diversity-aware prefix fine-tuning technique that promotes output diversity and reaches 80% accuracy on mathematical reasoning tasks using eight times less compute than strong baselines.

Contribution

It provides a structured survey of TTS methods and proposes ADAPT, a novel diversity-focused fine-tuning approach that improves reasoning accuracy efficiently.

Findings

01

ADAPT reaches 80% accuracy on mathematical reasoning tasks.

02

ADAPT reaches 80% accuracy using eight times less compute than strong baselines.

03

Diversity is crucial for maximizing TTS effectiveness.

Abstract

Test-Time Scaling (TTS) improves the reasoning performance of Large Language Models (LLMs) by allocating additional compute during inference. We conduct a structured survey of TTS methods and categorize them into sampling-based, search-based, and trajectory optimization strategies. We observe that reasoning-optimized models often produce less diverse outputs, which limits TTS effectiveness. To address this, we propose ADAPT (A Diversity Aware Prefix fine-Tuning), a lightweight method that applies prefix tuning with a diversity-focused data strategy. Experiments on mathematical reasoning tasks show that ADAPT reaches 80% accuracy using eight times less compute than strong baselines. Our findings highlight the essential role of generative diversity in maximizing TTS effectiveness.
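The link between diversity and sampling-based TTS can be illustrated with a minimal sketch. It assumes majority voting over sampled final answers, which is one common sampling-based strategy; the abstract does not say which aggregation the paper uses. The function names and the toy answer lists below are hypothetical and are not results from the paper.

```python
from collections import Counter


def majority_vote(answers):
    """Aggregate sampled final answers by picking the most frequent one."""
    return Counter(answers).most_common(1)[0][0]


def covered(answers, reference):
    """True if at least one sample matches the reference answer."""
    return reference in answers


def distinct_ratio(answers):
    """Fraction of samples that are unique: a crude diversity measure."""
    return len(set(answers)) / len(answers)


# Hypothetical toy data: four sampled final answers to one question whose
# reference answer is "42". These are hand-written, not model outputs.
reference = "42"
low_diversity = ["41", "41", "41", "41"]
high_diversity = ["41", "42", "42", "40"]

for name, samples in [("low", low_diversity), ("high", high_diversity)]:
    print(
        name,
        majority_vote(samples),
        covered(samples, reference),
        distinct_ratio(samples),
    )
```

In this toy setup the low-diversity set repeats a single wrong answer, so drawing more samples adds compute without adding information. The more varied set contains the reference answer and the vote recovers it. This only sketches the intuition behind the abstract's claim that low output diversity limits TTS effectiveness.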

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attentive Walk-Aggregating Graph Neural Network