A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei, Haowei Liu, Xuyang Wu, Yi Fang

TL;DR
This survey reviews various feedback-based multi-step reasoning techniques for large language models in mathematics, highlighting methods that improve reasoning accuracy through training-based, training-free, and outcome feedback strategies.
Contribution
It provides a comprehensive overview of existing feedback mechanisms in multi-step math reasoning for LLMs, establishing a foundation for future research.
Findings
Feedback strategies improve reasoning accuracy
Training-free techniques leverage external tools
Outcome rewards offer cost-effective alternatives
Abstract
Recent progress in large language models (LLMs) has shown that chain-of-thought prompting strategies improve the reasoning ability of LLMs by encouraging problem solving through multiple steps. Subsequent research therefore aimed to integrate the multi-step reasoning process into the LLM itself through process rewards as feedback, and achieved improvements over prompting strategies. Due to the cost of step-level annotation, some work turns to outcome rewards as feedback. Aside from these training-based approaches, training-free techniques leverage frozen LLMs or external tools for feedback at each step to enhance the reasoning process. With the abundance of work in mathematics due to its logical nature, we present a survey of strategies utilizing feedback at the step and outcome levels to enhance multi-step math reasoning for LLMs. As multi-step reasoning emerges as a crucial component in scaling LLMs,…
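The distinction the abstract draws between step-level (process) feedback and outcome-level feedback can be illustrated with a toy sketch. This is not a method from the survey. The names `Step`, `step_feedback`, `outcome_feedback`, and `first_error` are hypothetical, and a plain arithmetic recomputation stands in for what would in practice be a process reward model, a frozen LLM, or an external tool.

```python
import operator
from dataclasses import dataclass
from typing import List, Optional

# Supported operators for this toy checker (an assumption of the sketch).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


@dataclass
class Step:
    """One reasoning step: the model claims that `a op b == claimed`."""
    a: int
    op: str
    b: int
    claimed: int


def step_feedback(steps: List[Step]) -> List[bool]:
    """Step-level feedback: one verdict per step, by recomputing each step."""
    return [OPS[s.op](s.a, s.b) == s.claimed for s in steps]


def outcome_feedback(steps: List[Step], reference: int) -> bool:
    """Outcome-level feedback: a single verdict on the final answer only."""
    return steps[-1].claimed == reference


def first_error(steps: List[Step]) -> Optional[int]:
    """Index of the first incorrect step, or None if every step checks out."""
    for i, ok in enumerate(step_feedback(steps)):
        if not ok:
            return i
    return None


# Problem: (2 + 3) * 4, reference answer 20.
sound = [Step(2, "+", 3, 5), Step(5, "*", 4, 20)]
flawed = [Step(2, "+", 3, 6), Step(6, "*", 4, 24)]  # slip in the first step

for name, solution in [("sound", sound), ("flawed", flawed)]:
    print(name, outcome_feedback(solution, 20), step_feedback(solution), first_error(solution))
```

For the flawed solution, outcome feedback only reports that the final answer is wrong, while step feedback points to the first step as the source of the error. Obtaining that per-step signal is what requires step-level annotation or a verifier, which is the cost the abstract refers to.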
Taxonomy
Topics: Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning · Natural Language Processing Techniques
