Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation

Chengwei Dai; Kun Li; Wei Zhou; Songlin Hu

arXiv:2405.19737·cs.CL·May 31, 2024

TL;DR

This paper introduces EDIT, a novel distillation method that improves small language models' ability to learn key reasoning steps from large models by analyzing dual reasoning paths with divergent conclusions.

Contribution

We propose a mistake-driven distillation approach that identifies and emphasizes crucial reasoning steps, enhancing the learning of key reasoning processes in smaller models.

Findings

01. EDIT improves key reasoning step learning in SLMs
02. Dual CoTs reveal crucial reasoning steps with divergent conclusions
03. EDIT outperforms traditional fine-tuning on reasoning benchmarks

Abstract

As Large Language Models (LLMs) scale up and gain powerful Chain-of-Thoughts (CoTs) reasoning abilities, practical resource constraints drive efforts to distill these capabilities into more compact Smaller Language Models (SLMs). We find that CoTs consist mainly of simple reasoning forms, with a small proportion ($\approx 4.7\%$) of key reasoning steps that truly impact conclusions. However, previous distillation methods typically involve supervised fine-tuning of student SLMs only on correct CoT data produced by teacher LLMs, resulting in students that struggle to learn the key reasoning steps and instead imitate the teacher's reasoning forms, making errors or omissions on these steps. To address these issues, drawing an analogy to human learning, where analyzing mistakes against correct solutions often reveals the crucial steps leading to success or failure, we propose…
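
The analogy in the abstract, comparing a mistaken solution against a correct one to surface the steps that matter, can be illustrated with a small sketch. This is not the paper's algorithm (the abstract is cut off before the method is described). The function name `divergent_steps`, the toy CoTs, and the choice of Python's `difflib` for the comparison are all assumptions made purely for illustration.

```python
from difflib import SequenceMatcher


def divergent_steps(correct_cot, wrong_cot):
    """Return pairs of step spans where two chains of thought differ.

    Hypothetical helper: each CoT is a list of step strings, and steps are
    compared by exact string equality. A real system would need a softer
    notion of step similarity.
    """
    matcher = SequenceMatcher(a=correct_cot, b=wrong_cot, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep the span from each chain that has no counterpart in the other.
            diffs.append((correct_cot[i1:i2], wrong_cot[j1:j2]))
    return diffs


# Toy dual CoTs (invented for this example): same setup, divergent conclusions.
correct = ["Tom has 3 apples.", "He buys 2 more.", "3 + 2 = 5.", "The answer is 5."]
wrong = ["Tom has 3 apples.", "He buys 2 more.", "3 * 2 = 6.", "The answer is 6."]

for good_span, bad_span in divergent_steps(correct, wrong):
    print("correct:", good_span)
    print("wrong:  ", bad_span)
```

In this toy pair the two setup steps are shared, so only the arithmetic step and the conclusion are reported as differing. That is the kind of small, decisive region the abstract calls key reasoning steps.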

Code & Models

Repositories

c-w-d/edit (Official)

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · AI-based Problem Solving and Planning