TL;DR
This paper introduces Solution-Guidance Fine-Tuning (SGFT), an approach that enhances small language models' reasoning abilities by training them to generate problem-solving guidance, improving performance on reasoning tasks with minimal data.
Contribution
The paper proposes a new reasoning strategy and a plug-and-play fine-tuning paradigm that significantly boosts small language models' reasoning capabilities using limited training data.
Findings
SGFT improves reasoning accuracy of small models
Method enables flexible, prompt-based problem solving
Significant performance gains on reasoning benchmarks
Abstract
Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks. Advances in prompt engineering and fine-tuning techniques have further enhanced their ability to address complex reasoning challenges. However, these advanced capabilities are often exclusive to models exceeding 100 billion parameters. Although Chain-of-Thought (CoT) fine-tuning methods have been explored for smaller models (under 10 billion parameters), they typically depend on extensive CoT training data, which can introduce inconsistencies and limit effectiveness in low-data settings. To overcome these limitations, this paper introduces a new reasoning strategy, Solution Guidance (SG), and a plug-and-play training paradigm, Solution-Guidance Fine-Tuning (SGFT), for enhancing the reasoning capabilities of small language models. SG focuses on problem understanding and decomposition at the…
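To make the "plug-and-play" idea concrete, the following is a minimal sketch of how a guidance-then-answer pipeline could be wired together. It is an illustration under stated assumptions, not the authors' implementation: the two-stage split, the prompt wording, and the names `TextModel`, `build_guidance_prompt`, `build_answer_prompt`, and `solve_with_guidance` are all hypothetical and do not come from the paper. Models are represented as plain callables so the sketch does not depend on any particular library.

```python
from typing import Callable, Tuple

# Hypothetical type: any function that maps a prompt string to generated text.
TextModel = Callable[[str], str]


def build_guidance_prompt(question: str) -> str:
    """Ask for problem understanding and decomposition, not a computed answer.

    The wording is an assumption made for illustration only.
    """
    return (
        "Read the problem and outline how to solve it. "
        "Describe the sub-problems and their order; do not compute the answer.\n\n"
        f"Problem: {question}\nGuidance:"
    )


def build_answer_prompt(question: str, guidance: str) -> str:
    """Combine the original question with the generated guidance."""
    return (
        f"Problem: {question}\n"
        f"Guidance: {guidance}\n"
        "Follow the guidance and give the final answer.\nAnswer:"
    )


def solve_with_guidance(
    question: str,
    guidance_model: TextModel,
    answer_model: TextModel,
) -> Tuple[str, str]:
    """Run both stages and return (guidance, answer).

    Assumed setup: guidance_model is the small fine-tuned model and
    answer_model is any model that can follow a prompt.
    """
    guidance = guidance_model(build_guidance_prompt(question)).strip()
    answer = answer_model(build_answer_prompt(question, guidance)).strip()
    return guidance, answer
```

Because the guidance is passed along as ordinary prompt text, either model can be swapped without changing the other, which is one plausible reading of "plug-and-play" here.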
