No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function
Haotian Xu

TL;DR
This paper introduces a novel method combining Monte Carlo Tree Search and an energy function to enhance the mathematical reasoning abilities of large language models without additional fine-tuning, achieving significant performance improvements.
Contribution
It presents a Residual-based Energy Model and MCTS-guided reasoning approach that improves LLMs' mathematical reasoning without extra fine-tuning or reinforcement learning.
Findings
Significant improvement in pass@1 metric on GSM8k and AQUA-RAT benchmarks.
Effective use of energy function as a path verifier in reasoning tasks.
No additional fine-tuning or reinforcement learning required.
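To make the "path verifier" finding concrete, the sketch below ranks candidate reasoning paths by an unnormalized residual-EBM score, log p_LM(x) - E(x). This is the usual residual energy-based model form and is assumed here, since the summary does not spell out the scoring rule. The function names and the toy log-probabilities and energies are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Each candidate is (path_text, lm_logprob, energy).
# All values below are made-up toy numbers for illustration only.
Candidate = Tuple[str, float, float]


def residual_score(lm_logprob: float, energy: float) -> float:
    """Unnormalized log-score of a path under an assumed residual EBM:
    log p_LM(x) - E(x). Lower energy means a more trusted path."""
    return lm_logprob - energy


def pick_best_path(candidates: List[Candidate]) -> str:
    """Return the text of the candidate with the highest residual score."""
    best = max(candidates, key=lambda c: residual_score(c[1], c[2]))
    return best[0]


candidates: List[Candidate] = [
    ("path A", -2.0, 3.0),  # more likely under the LM, but high energy
    ("path B", -2.5, 0.5),  # slightly less likely, but low energy
]

chosen = pick_best_path(candidates)
```

In this toy case the language model alone would prefer path A, while the energy term shifts the choice to path B, which is the kind of correction a verifier is meant to provide.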
Abstract
Large language models (LLMs) demonstrate impressive language understanding and contextual learning abilities, making them suitable for natural language processing (NLP) tasks and complex mathematical reasoning. However, when applied to mathematical reasoning tasks, LLMs often struggle to generate correct reasoning steps and answers despite having high probabilities for the solutions. To overcome this limitation and enhance the mathematical reasoning capabilities of fine-tuned LLMs without additional fine-tuning steps, we propose a method that incorporates Monte Carlo Tree Search (MCTS) and a lightweight energy function to rank decision steps and enable immediate reaction and precise reasoning. Specifically, we re-formulate the fine-tuned LLMs into a Residual-based Energy Model (Residual-EBM) and employ noise contrastive estimation to estimate the energy function's parameters. We then…
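The abstract mentions estimating the energy function with noise contrastive estimation (NCE). A minimal sketch of a binary NCE objective follows, assuming the fine-tuned LLM itself serves as the noise distribution, so that the classifier logit for a sample reduces to -E(x). That choice, the function names, and the example energies are assumptions for illustration and not the paper's confirmed setup.

```python
import math
from typing import Sequence


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def nce_loss(pos_energies: Sequence[float], neg_energies: Sequence[float]) -> float:
    """Binary NCE loss for an assumed residual EBM.

    pos_energies: E(x) for reference (correct) solutions.
    neg_energies: E(x) for solutions sampled from the LLM (noise samples).
    The loss pushes energies of positives down and energies of negatives up.
    """
    pos_term = -sum(log_sigmoid(-e) for e in pos_energies) / len(pos_energies)
    neg_term = -sum(log_sigmoid(e) for e in neg_energies) / len(neg_energies)
    return pos_term + neg_term


# Hypothetical energies: a well-trained model versus a badly wrong one.
good = nce_loss(pos_energies=[-5.0], neg_energies=[5.0])
bad = nce_loss(pos_energies=[5.0], neg_energies=[-5.0])
```

With all energies at zero the loss equals 2 ln 2, the value for a classifier that cannot tell reference solutions from sampled ones. It shrinks toward zero as the energy function separates the two groups.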
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
