Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method via Multi-LoRA Interaction
Xinhe Li, Jiajun Liu, Peng Wang

TL;DR
This paper introduces LoRID, a novel reasoning distillation method inspired by human dual-process thinking, which significantly improves small language models' mathematical reasoning by leveraging knowledge-enhanced datasets and mutual feedback mechanisms.
Contribution
It proposes a multi-LoRA interaction framework that imitates human System 1 and System 2 thinking to enhance reasoning in small language models, achieving state-of-the-art results.
Findings
LoRID outperforms previous methods on the GSM8K dataset.
It improves accuracy across multiple base models.
Mutual feedback enhances reasoning consistency.
Abstract
Recent studies have demonstrated that Large Language Models (LLMs) have strong mathematical reasoning abilities but rely on hundreds of billions of parameters. To tackle the challenge of poor reasoning in Small Language Models (SLMs), existing methods typically leverage LLMs to generate massive amounts of data for cramming training. In psychological terms, these methods are akin to System 1 thinking, which resolves reasoning problems rapidly based on experience and intuition. However, human learning also requires System 2 thinking, where knowledge is first acquired and then reinforced through practice. Inspired by these two distinct modes of thinking, we propose a novel method based on the multi-LoRA Interaction for mathematical reasoning Distillation (LoRID). First, we input the question and reasoning of each sample into an LLM to create knowledge-enhanced datasets. Subsequently, we train a LoRA block on…
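The first step in the abstract (feeding each sample's question and reasoning to an LLM to build a knowledge-enhanced dataset) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `Sample` and `EnhancedSample` record types, and the `fake_llm` stand-in are all assumptions introduced here so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Sample:
    question: str
    reasoning: str


@dataclass
class EnhancedSample:
    question: str
    knowledge: str
    reasoning: str


def build_prompt(sample: Sample) -> str:
    # Hypothetical prompt wording; the paper's actual prompt is not shown on this page.
    return (
        "Question:\n" + sample.question + "\n\n"
        "Reasoning:\n" + sample.reasoning + "\n\n"
        "List the general mathematical knowledge needed to solve this question."
    )


def enhance(samples: Iterable[Sample], llm: Callable[[str], str]) -> List[EnhancedSample]:
    # Ask the LLM for the knowledge behind each (question, reasoning) pair
    # and store it alongside the original fields.
    enhanced = []
    for s in samples:
        knowledge = llm(build_prompt(s)).strip()
        enhanced.append(EnhancedSample(s.question, knowledge, s.reasoning))
    return enhanced


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call (assumption, for offline use only).
    return "Total cost = unit price x quantity."


data = [Sample("A pen costs 3 dollars. How much do 4 pens cost?",
               "3 * 4 = 12. The answer is 12.")]
result = enhance(data, fake_llm)
print(result[0].knowledge)
```

Passing the LLM as a plain callable keeps the sketch independent of any particular API client; a real pipeline would swap `fake_llm` for an actual model call.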
Code & Models
- 🤗 LoRID-Math/GSM8K-Mistral-7B-IR (model, 3 downloads, 1 like)
- 🤗 LoRID-Math/GSM8K-Mistral-7B-KG (model, 3 downloads, 1 like)
- 🤗 LoRID-Math/GSM8K-Mistral-7B-DR (model, 3 downloads, 1 like)
- 🤗 LoRID-Math/GSM8K-LLaMA-2-7B-IR (model, 5 downloads, 1 like)
- 🤗 LoRID-Math/GSM8K-LLaMA-2-7B-KG (model, 4 downloads, 1 like)
- 🤗 LoRID-Math/GSM8K-LLaMA-2-7B-DR (model, 4 downloads, 1 like)
- 🤗 LoRID-Math/MATH-LLaMA-2-7B-IR (model, 5 downloads, 1 like)
- 🤗 LoRID-Math/MATH-LLaMA-2-7B-KG (model, 4 downloads, 1 like)
- 🤗 LoRID-Math/MATH-LLaMA-2-7B-DR (model, 4 downloads, 1 like)
- 🤗 LoRID-Math/MATH-Mistral-7B-IR (model, 2 downloads, 1 like)
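The repository names listed above appear to follow a `<dataset>-<base model>-<suffix>` pattern. The small helper below splits a repository ID along that pattern. The pattern is inferred only from the names on this page, and `parse_repo_id` and `Checkpoint` are hypothetical names; the sketch does not interpret what the IR, KG, and DR suffixes stand for.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    org: str
    dataset: str
    base_model: str
    suffix: str


def parse_repo_id(repo_id: str) -> Checkpoint:
    # "LoRID-Math/GSM8K-LLaMA-2-7B-DR" -> org "LoRID-Math", name "GSM8K-LLaMA-2-7B-DR"
    org, name = repo_id.split("/", 1)
    # The first hyphen-separated token is taken as the dataset (assumed convention).
    dataset, rest = name.split("-", 1)
    # The last hyphen-separated token is taken as the suffix (assumed convention).
    base_model, suffix = rest.rsplit("-", 1)
    return Checkpoint(org, dataset, base_model, suffix)


ckpt = parse_repo_id("LoRID-Math/GSM8K-LLaMA-2-7B-DR")
print(ckpt.dataset, ckpt.base_model, ckpt.suffix)
```

Grouping the parsed checkpoints by `dataset` and `base_model` makes it easy to see which entries belong together.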
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
