Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory

Yexiang Liu; Zekun Li; Zhi Fang; Nan Xu; Ran He; Tieniu Tan

arXiv:2505.10981 · cs.AI · August 18, 2025

TL;DR

This paper investigates how different prompting strategies perform as test-time compute scales in large language models. It finds that simple methods often outperform complex ones once the number of sampled responses is large, and supports this with theoretical analysis and a probabilistic prediction approach.

Contribution

The paper provides a systematic experimental analysis, a theoretical explanation of the observed trends, and a probabilistic method for predicting and improving prompting strategy performance under test-time scaling of LLMs.

Findings

01

Simple prompting strategies outperform complex ones as the number of sampled responses grows.

02

Theoretical proofs explain the observed performance trends.

03

A probabilistic method accurately predicts optimal prompting strategies.

Abstract

Recently, scaling test-time compute on Large Language Models (LLMs) has garnered wide attention. However, there has been limited investigation of how various reasoning prompting strategies perform as test-time compute scales. In this paper, we focus on a standard and realistic scaling setting: majority voting. We systematically conduct experiments on 6 LLMs $\times$ 8 prompting strategies $\times$ 6 benchmarks. Experimental results consistently show that as the number of samples and the computational overhead increase, complicated prompting strategies with superior initial performance gradually fall behind simple Chain-of-Thought. We analyze this phenomenon and provide theoretical proofs. Additionally, we propose a probabilistic method to efficiently predict scaling performance and identify the best prompting strategy under large sampling budgets, eliminating the need for resource-intensive inference processes in…
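
The majority-voting setting named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's probabilistic method; it is a plain Monte Carlo estimate under the assumption that each sampled response is an independent draw from a fixed distribution over final answers. The function names (`majority_vote`, `estimate_vote_accuracy`) and the example distributions are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def estimate_vote_accuracy(answer_probs, correct, n_samples,
                           n_trials=2000, seed=0):
    """Monte Carlo estimate of majority-voting accuracy on one question.

    answer_probs: dict mapping each possible final answer to the
        (assumed) probability that a single sampled response gives it.
    correct: the ground-truth answer.
    n_samples: number of responses sampled before voting.
    """
    rng = random.Random(seed)
    answers = list(answer_probs)
    weights = [answer_probs[a] for a in answers]
    hits = 0
    for _ in range(n_trials):
        # Draw n_samples independent responses, then vote.
        sampled = rng.choices(answers, weights=weights, k=n_samples)
        if majority_vote(sampled) == correct:
            hits += 1
    return hits / n_trials


# Hypothetical per-question answer distributions (illustrative values only).
# "easy": the correct answer "A" is the single most likely answer.
# "hard": a wrong answer "B" is more likely than the correct answer "A".
easy = {"A": 0.4, "B": 0.3, "C": 0.3}
hard = {"A": 0.4, "B": 0.6}

for n in (1, 11, 101):
    print(n,
          estimate_vote_accuracy(easy, "A", n),
          estimate_vote_accuracy(hard, "A", n))
```

Under this independence assumption, voting over more samples pushes a question toward its most likely answer. Accuracy therefore rises with the sample count when the correct answer is the most likely one and falls when a wrong answer is. Two strategies with the same single-sample accuracy can thus scale very differently, depending on how their probability mass is spread across questions and wrong answers.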

Code & Models

Repositories

MraDonkey/rethinking_prompting
none · Official
