Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling
Fanzeng Xia, Yidong Luo, Tinko Sebastian Bartels, Yaqi Xu, and Tongxin Li

TL;DR
This paper demonstrates that combining in-context search prompting with test-time scaling significantly enhances LLM reasoning performance on complex, previously unsolvable problems, challenging existing evaluation paradigms.
Contribution
It introduces a novel approach that leverages advanced in-context search and internal scaling to substantially improve LLM reasoning capabilities on hard tasks.
Findings
Achieves up to 30x success rate improvement on complex tasks
Empirically validates the approach on NP-hard and real-world benchmarks
Theoretically extends the class of solvable reasoning problems
Abstract
Recent research has highlighted that Large Language Models (LLMs), even when trained to generate extended reasoning steps, still face significant challenges on hard reasoning problems. However, much of the existing literature relies on direct prompting with simple in-context learning examples for evaluation, which largely overlooks advanced techniques to elicit LLMs' deliberate reasoning before drawing conclusions that LLMs hit a performance ceiling. In this paper, we systematically explore the combined potential of in-context search and test-time scaling on super hard reasoning tasks. We find that by employing advanced in-context search prompting to LLMs augmented with internal scaling, one can achieve transformative performance breakthroughs on tasks previously deemed "unsolvable" (e.g., reported success rates below 5%). We provide both empirical results and theoretical analysis…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · AI-based Problem Solving and Planning
