SCOTT: Self-Consistent Chain-of-Thought Distillation

Peifeng Wang; Zhengyang Wang; Zheng Li; Yifan Gao; Bing Yin; Xiang Ren

arXiv:2305.01879 · cs.CL · September 1, 2023 · 1 citation

TL;DR

This paper introduces SCOTT, a method for distilling large language models into smaller, self-consistent models that generate more faithful rationales for their predictions, improving interpretability without sacrificing performance.

Contribution

The paper presents a novel knowledge distillation approach that ensures the smaller model produces consistent and faithful chain-of-thought rationales aligned with the teacher model.

Findings

1. Smaller models can generate more faithful rationales using SCOTT.
2. SCOTT achieves comparable task performance to larger models.
3. The method enhances the interpretability and trustworthiness of LLMs.

Abstract

Large language models (LMs) beyond a certain scale demonstrate the emergent capability of generating free-text rationales for their predictions via chain-of-thought (CoT) prompting. While CoT can yield dramatically improved performance, such gains are only observed for sufficiently large LMs. Even more concerning, there is little guarantee that the generated rationales are consistent with the LM's predictions or faithfully justify its decisions. In this work, we propose a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger. To form better supervision, we elicit rationales supporting the gold answers from a large LM (teacher) by contrastive decoding, which encourages the teacher to generate tokens that become more plausible only when the answer is considered. To ensure faithful distillation, we use the…
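The contrastive decoding idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`log_softmax`, `contrastive_scores`, `pick_token`), the four-word vocabulary, and the logits are all hypothetical, and the scoring rule (log-probability with the answer in context minus log-probability without it) is an assumed reading of "tokens that become more plausible only when the answer is considered".

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_scores(logits_with_answer, logits_without_answer):
    """Score each token by how much more plausible it becomes
    when the answer is included in the teacher's context.

    Assumed rule: log p(token | question, answer) - log p(token | question).
    """
    lp_with = log_softmax(logits_with_answer)
    lp_without = log_softmax(logits_without_answer)
    return [a - b for a, b in zip(lp_with, lp_without)]


def pick_token(vocab, logits_with_answer, logits_without_answer):
    """Greedy choice of the token with the highest contrastive score."""
    scores = contrastive_scores(logits_with_answer, logits_without_answer)
    best = max(range(len(vocab)), key=scores.__getitem__)
    return vocab[best]


# Hypothetical next-token logits from a teacher model (made-up numbers).
vocab = ["the", "because", "wings", "engine"]
with_answer = [2.0, 1.0, 1.5, -1.0]     # answer present in the prompt
without_answer = [2.0, 1.0, 0.0, -1.0]  # answer withheld

scores = contrastive_scores(with_answer, without_answer)
chosen = pick_token(vocab, with_answer, without_answer)
print(chosen)
```

In this toy case only "wings" gains probability once the answer is in context, so it receives the single positive score and is selected, even though "the" has the highest raw logit in both settings. A practical decoder would likely also restrict the choice to tokens that are plausible under the answer-conditioned distribution, which this sketch omits.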

Code & Models

Repositories

wangpf3/consistent-cot-distillation (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science

Methods: Knowledge Distillation