Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Kaituo Feng; Changsheng Li; Xiaolu Zhang; Jun Zhou; Ye Yuan; Guoren Wang

arXiv:2405.16064 · cs.CL · May 28, 2024

TL;DR

This paper introduces KPOD, a chain-of-thought distillation framework for LLMs that gives more weight to keypoint tokens in a rationale and distills reasoning steps in an easy-to-hard order, so that reasoning ability transfers more reliably to smaller student models.

Contribution

The paper proposes a unified approach with token weighting and progressive distillation strategies to improve reasoning accuracy in student models.

Findings

01

KPOD significantly outperforms previous distillation methods on four reasoning benchmarks.

02

Token weighting effectively emphasizes key reasoning tokens during training.

03

Progressive distillation aligns better with human-like learning order, improving reasoning skills.

Abstract

Chain-of-thought distillation is a powerful technique for transferring reasoning abilities from large language models (LLMs) to smaller student models. Previous methods typically require the student to mimic the step-by-step rationale produced by LLMs, often facing the following challenges: (i) Tokens within a rationale vary in significance, and treating them equally may fail to accurately mimic keypoint tokens, leading to reasoning errors. (ii) They usually distill knowledge by consistently predicting all the steps in a rationale, which falls short in distinguishing the learning order of step generation. This diverges from the human cognitive progression of starting with easy tasks and advancing to harder ones, resulting in sub-optimal outcomes. To this end, we propose a unified framework, called KPOD, to address these issues. Specifically, we propose a token weighting module utilizing…
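The abstract is cut off before it describes how the two components work, so the following is only a minimal sketch of the two ideas it names, not KPOD's actual formulation. It assumes a per-token weighted negative log-likelihood for the token weighting idea and a linear easy-to-hard schedule for the progressive idea. The function names `weighted_token_nll` and `steps_to_predict`, the example weights, and the linear schedule are all hypothetical and do not come from the paper.

```python
import math


def weighted_token_nll(token_probs, weights):
    """Weighted average negative log-likelihood over rationale tokens.

    token_probs: probability the student assigns to each target token.
    weights: per-token importance (assumed given; how KPOD derives
             them is not stated in the truncated abstract).
    """
    if len(token_probs) != len(weights):
        raise ValueError("token_probs and weights must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(-w * math.log(p) for p, w in zip(token_probs, weights))
    return weighted_sum / total_weight


def steps_to_predict(stage, num_stages, num_steps):
    """Hypothetical linear easy-to-hard schedule.

    Returns how many rationale steps the student is asked to generate
    at a given training stage: few steps early on, all steps at the end.
    """
    if not 1 <= stage <= num_stages:
        raise ValueError("stage must be in [1, num_stages]")
    return max(1, math.ceil(num_steps * stage / num_stages))


# The student is unsure (p = 0.5) about the middle token only.
probs = [0.9, 0.5, 0.9]
uniform_loss = weighted_token_nll(probs, [1.0, 1.0, 1.0])
keypoint_loss = weighted_token_nll(probs, [0.5, 2.0, 0.5])
```

With these illustrative numbers, `keypoint_loss` is larger than `uniform_loss`, because the token the student is least sure about carries the largest weight; this is the sense in which weighting penalizes mistakes on keypoint tokens more than mistakes on filler tokens. The schedule function simply grows the number of steps to generate from a small fraction to the full rationale as training stages advance.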

Taxonomy

Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management