Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models
Amir Azarmehr, Soheil Behnezhad, Alma Ghafari

TL;DR
This paper models test-time computation for large language models as an algorithm interacting with a Markov chain, characterizes the structure of the optimal algorithm as a caterpillar tree, and introduces CaT, a new method that achieves higher success rates than ToT with fewer token generations.
Contribution
The paper provides a theoretical framework for test-time algorithms, characterizes the structure of the optimal algorithm as a caterpillar tree, and proposes CaT, a novel algorithm that outperforms ToT.
Findings
CaT achieves higher success rates than ToT.
CaT reduces the number of token generations.
The optimal test-time algorithm forms a caterpillar tree structure.
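For readers unfamiliar with the term, a caterpillar tree in the standard graph-theoretic sense is a tree in which every vertex lies on a central path (the spine) or is adjacent to it. The sketch below checks that property for a tree given as an edge list. The function name `is_caterpillar` and the edge-list input are illustrative assumptions, and this summary does not say how the paper maps the algorithm's explored states onto such a tree.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def is_caterpillar(edges: Iterable[Tuple[int, int]]) -> bool:
    """Return True if the tree described by `edges` is a caterpillar.

    Assumption: `edges` already describes a tree (connected, acyclic);
    this function does not verify that.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Removing every leaf (degree-1 vertex) of a tree leaves a subtree.
    spine = {v for v, nbrs in adj.items() if len(nbrs) > 1}
    # That subtree is a path exactly when no vertex in it has more than
    # two neighbours that also survived the leaf removal.
    return all(len(adj[v] & spine) <= 2 for v in spine)


# A spine 0-1-2 with leaves hanging off it.
caterpillar = [(0, 1), (1, 2), (0, 5), (1, 3), (1, 4), (2, 6)]
# Three legs of length two meeting at vertex 0: not a caterpillar.
spider = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]

print(is_caterpillar(caterpillar))  # True
print(is_caterpillar(spider))       # False
```

A simple path and a star both pass this check, since a path is its own spine and a star's spine is a single vertex.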
Abstract
Large language models (LLMs) can often produce substantially better outputs when allowed to use additional test-time computation, such as sampling, chain of thought, backtracking, or revising partial solutions. Despite the growing empirical success of such techniques, there is limited theoretical understanding of how inference-time computation should be structured, or what constitutes an optimal use of a fixed computation budget. We model test-time computation as an algorithm interacting with a Markov chain: at any point, the algorithm may resume generation from any previously observed state. That is, unlike standard Markov chains where the states are drawn passively, we allow the algorithm to backtrack to any previously observed state of the Markov chain at any time. Many of the existing test-time algorithms, such as Chain-of-Thought (CoT) (Wei et al., 2023), Tree-of-Thoughts (ToT)…
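The interaction model described in the abstract can be illustrated with a small sketch: an oracle exposes a Markov chain, records every state it has produced, and lets the caller draw the next transition from any of those recorded states. Everything named here (`ChainOracle`, `toy_step`, `linear_chain`, `backtrack_on_dead`, the toy chain with a goal and a dead state) is a hypothetical illustration. It is not the paper's CaT algorithm, only two simple strategies expressed in that interface.

```python
import random
from typing import Callable, List, Optional

GOAL, DEAD = 3, -1  # toy chain: states 0..3 plus an absorbing dead state


def toy_step(state: int, rng: random.Random) -> int:
    """Hypothetical transition: advance with probability 0.5, else die."""
    if state in (GOAL, DEAD):
        return state
    return state + 1 if rng.random() < 0.5 else DEAD


class ChainOracle:
    """Markov chain that may be resumed from any previously observed state."""

    def __init__(self, step: Callable[[int, random.Random], int],
                 start: int, seed: int = 0) -> None:
        self._step = step
        self._rng = random.Random(seed)
        self.states: List[int] = [start]           # observed states by node id
        self.parent: List[Optional[int]] = [None]  # node each state was drawn from
        self.samples = 0                           # transitions drawn so far

    def sample_from(self, node: int) -> int:
        """Draw one transition out of an observed node; return the new node id."""
        if not 0 <= node < len(self.states):
            raise ValueError("can only resume from a previously observed state")
        self.states.append(self._step(self.states[node], self._rng))
        self.parent.append(node)
        self.samples += 1
        return len(self.states) - 1


def linear_chain(oracle: ChainOracle, budget: int) -> bool:
    """Always continue from the newest state, never backtracking."""
    node = 0
    for _ in range(budget):
        node = oracle.sample_from(node)
        if oracle.states[node] == GOAL:
            return True
    return False


def backtrack_on_dead(oracle: ChainOracle, budget: int) -> bool:
    """Resample from the same node whenever the drawn child is dead."""
    node = 0
    for _ in range(budget):
        child = oracle.sample_from(node)
        if oracle.states[child] == GOAL:
            return True
        if oracle.states[child] != DEAD:
            node = child  # move forward only along live states
    return False
```

The `parent` list records the tree of explored states, so different strategies can be compared by the shape of the tree they build and by `samples`, the number of transitions they consumed from the budget.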
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms
