Bag of Tricks for Inference-time Computation of LLM Reasoning

Fan Liu; Wenshuo Chao; Naiqiang Tan; Hao Liu

arXiv:2502.07191 · cs.AI · February 18, 2025

TL;DR

This paper benchmarks and analyzes various inference-time strategies for improving reasoning performance in large language models, revealing effective techniques and establishing a standardized evaluation framework.

Contribution

It systematically evaluates inference-time computation methods across multiple models and tasks, highlighting overlooked strategies that significantly boost reasoning performance.

Findings

01. Tuning temperature can improve reasoning accuracy by up to 5%.

02. A standardized benchmark for inference-time methods is established.

03. Over 20,000 GPU hours were used for extensive experiments.
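The temperature finding concerns the sampling temperature applied to a model's next-token logits before sampling. The following is a minimal sketch of that general mechanism, not the paper's code: the function names `softmax_with_temperature` and `sample_token` and the example logits are hypothetical, chosen only for illustration.

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
example_logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(example_logits, 0.5)  # concentrates on the top token
flat = softmax_with_temperature(example_logits, 2.0)   # spreads mass more evenly
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures produce more diverse samples. Which setting helps reasoning accuracy is an empirical question that depends on the model and task.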

Abstract

With the advancement of large language models (LLMs), solving complex reasoning tasks has gained increasing attention. Inference-time computation methods (e.g., Best-of-N and beam search) are particularly valuable because they can enhance reasoning performance without modifying model parameters or requiring additional training. However, these techniques come with implementation challenges, and most existing methods remain at the proof-of-concept stage, with limited practical adoption due to their computational complexity and varying effectiveness across tasks. In this paper, we investigate and benchmark diverse inference-time computation strategies across reasoning tasks of varying complexity. Since most current methods rely on a proposer-verifier pipeline that first generates candidate solutions (e.g., reasoning solutions) and then selects the best one based on reward signals…
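The proposer-verifier pipeline described in the abstract can be illustrated with a Best-of-N sketch. This is a minimal example under stated assumptions, not the paper's implementation: `best_of_n`, `propose`, `score`, and `toy_score` are hypothetical names, and the toy scorer stands in for a real reward model.

```python
from typing import Callable, List


def best_of_n(
    prompt: str,
    propose: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Sample n candidates from the proposer and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [propose(prompt) for _ in range(n)]
    # The verifier assigns each candidate a reward; ties resolve to the earliest candidate.
    return max(candidates, key=lambda candidate: score(prompt, candidate))


def toy_score(prompt: str, candidate: str) -> float:
    """Hypothetical verifier: rewards candidates whose answer ends in '7'."""
    return 1.0 if candidate.endswith("7") else 0.0


# Hypothetical proposer that replays a fixed list of candidate answers.
canned = iter(["answer: 3", "answer: 7", "answer: 5"])
chosen = best_of_n("What is 3 + 4?", lambda p: next(canned), toy_score, n=3)
```

In practice `propose` would call an LLM with sampling enabled and `score` would call a reward model or another verifier. The cost grows linearly with `n`, since every candidate is generated and scored.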

Code & Models

Repositories

usail-hkust/benchmark_inference_time_computation_LLM (Official)

Taxonomy

Topics: Semantic Web and Ontologies · Formal Methods in Verification · Business Process Modeling and Analysis