Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
Kaituo Feng, Changsheng Li, Xiaolu Zhang, Jun Zhou, Ye Yuan, Guoren Wang

TL;DR
This paper introduces KPOD, a novel distillation framework for LLMs that emphasizes keypoint tokens and progressive learning to enhance reasoning transfer to smaller models.
Contribution
The paper proposes a unified approach with token weighting and progressive distillation strategies to improve reasoning accuracy in student models.
Findings
KPOD significantly outperforms previous distillation methods on four reasoning benchmarks.
Token weighting effectively emphasizes key reasoning tokens during training.
Progressive distillation aligns better with human-like learning order, improving reasoning skills.
Abstract
Chain-of-thought distillation is a powerful technique for transferring reasoning abilities from large language models (LLMs) to smaller student models. Previous methods typically require the student to mimic the step-by-step rationale produced by LLMs, often facing the following challenges: (i) Tokens within a rationale vary in significance, and treating them equally may fail to accurately mimic keypoint tokens, leading to reasoning errors. (ii) They usually distill knowledge by consistently predicting all the steps in a rationale, which falls short in distinguishing the learning order of step generation. This diverges from the human cognitive progression of starting with easy tasks and advancing to harder ones, resulting in sub-optimal outcomes. To this end, we propose a unified framework, called KPOD, to address these issues. Specifically, we propose a token weighting module utilizing…
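The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract is truncated before it says how token weights are obtained or how the learning order is scheduled, so the weights and difficulty scores below are taken as given inputs, and the function names `weighted_token_nll` and `curriculum_subset` are hypothetical. The sketch only shows (a) a loss in which some rationale tokens count more than others and (b) a generic easy-to-hard schedule.

```python
import math


def weighted_token_nll(token_log_probs, token_weights):
    """Weighted mean negative log-likelihood over the tokens of one rationale.

    token_log_probs: log-probability the student assigns to each target token.
    token_weights:   non-negative importance per token (assumed to be given;
                     how KPOD derives them is not stated in the abstract).
    """
    if len(token_log_probs) != len(token_weights):
        raise ValueError("need exactly one weight per token")
    total_weight = sum(token_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * lp for w, lp in zip(token_weights, token_log_probs))
    return -weighted_sum / total_weight


def curriculum_subset(items, difficulties, stage, num_stages):
    """Return the easiest items unlocked at a given stage (1-indexed).

    Generic easy-to-hard schedule (an assumption, not the paper's scheduler):
    stage s out of S exposes the easiest ceil(n * s / S) items.
    """
    if not 1 <= stage <= num_stages:
        raise ValueError("stage must be in 1..num_stages")
    if len(items) != len(difficulties):
        raise ValueError("need exactly one difficulty per item")
    order = sorted(range(len(items)), key=lambda i: difficulties[i])
    keep = math.ceil(len(items) * stage / num_stages)
    return [items[i] for i in order[:keep]]
```

With uniform weights the loss reduces to the ordinary mean token loss; raising the weight of a token the student predicts poorly raises the loss, which is the sense in which keypoint tokens are emphasized. The schedule widens the training set stage by stage until every item is included.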
Taxonomy
Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management
