Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang

TL;DR
This paper introduces advantage-guided distillation, which uses a well-aligned teacher model to improve the alignment of Small Language Models (SLMs) and significantly narrows their performance gap with larger models.
Contribution
The paper proposes Advantage-Guided Distillation for Preference Alignment (ADPA), a method that transfers knowledge of human preferences from a large teacher LLM to small models in order to improve their alignment.
Findings
ADPA outperforms existing methods at aligning SLMs with human preferences.
Combining ADPA with Dual-Constrained Knowledge Distillation (DCKD) yields even better alignment results.
Both approaches significantly reduce the performance gap between small and large language models.
Abstract
Alignment techniques enable Large Language Models (LLMs) to generate outputs that align with human preferences and play a crucial role in their effectiveness. However, their impact often diminishes when applied to Small Language Models (SLMs), likely due to the limited capacity of these models. Instead of directly applying existing alignment techniques to SLMs, we propose to utilize a well-aligned teacher LLM to guide the alignment process for these models, thereby facilitating the transfer of the teacher's knowledge of human preferences to the student model. To achieve this, we first explore a straightforward approach, Dual-Constrained Knowledge Distillation (DCKD), that employs knowledge distillation with two KL-divergence constraints from the aligned teacher to the unaligned student. To further enhance the student's ability to distinguish between preferred and dispreferred responses,…
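The DCKD idea described in the abstract (knowledge distillation with two KL-divergence constraints from teacher to student) might look roughly like the following. This is a minimal sketch, not the authors' implementation: it assumes the two constraints are applied to the preferred and the dispreferred response respectively, which the summary above does not spell out. The function names (`sequence_kl`, `dual_kl_loss`), the `weight` parameter, and the use of plain per-token logits are all hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def sequence_kl(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student) over one response.

    Each argument is a list of per-token logit vectors of equal length.
    """
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_token) / len(per_token)


def dual_kl_loss(teacher_pref, student_pref,
                 teacher_dispref, student_dispref, weight=1.0):
    """Sum of two KL terms (assumption: one per response type).

    `weight` is a hypothetical knob that balances the dispreferred term.
    """
    return (sequence_kl(teacher_pref, student_pref)
            + weight * sequence_kl(teacher_dispref, student_dispref))


# Toy logits: two tokens, vocabulary of three entries.
teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
student = [[0.0, 0.0, 0.0], [1.0, -1.0, 0.0]]
loss = dual_kl_loss(teacher, student, teacher, student)
```

In this sketch the loss is zero when the student reproduces the teacher's distributions on both responses and grows as they diverge. In practice the logits would come from the teacher and student models evaluated on the same prompt and response tokens.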
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Knowledge Distillation · ALIGN
