Quantification of Large Language Model Distillation

Sunbowen Lee; Junting Zhou; Chang Ao; Kaige Li; Xinrun Du; Sirui He; Haihong Wu; Tianci Liu; Jiaheng Liu; Hamid Alinejad-Rokny; Min Yang; Yitao Liang; Zhoufutu Wen; Shiwen Ni

arXiv:2501.12619·cs.CL·February 18, 2025

TL;DR

This paper introduces a framework to systematically evaluate and quantify the extent of knowledge transfer and homogenization in large language model distillation. It highlights how distillation degree differs among models and argues that greater transparency supports robustness.

Contribution

The paper proposes a method to measure distillation effects in LLMs along two axes, identity perception and response similarity, to improve understanding of model homogenization.

Findings

01. Most well-known LLMs show high distillation degrees; Claude, Doubao, and Gemini are the exceptions.

02. Base LLMs exhibit higher distillation degrees than aligned LLMs.

03. The framework improves transparency and understanding of LLM data distillation processes.

Abstract

Model distillation is a fundamental technique in building large language models (LLMs), transferring knowledge from a teacher model to a student model. However, distillation can lead to model homogenization, reducing diversity among models and impairing their ability to robustly handle complex or novel tasks. These limitations underscore the need to systematically quantify the distillation process and its impact. In this work, we propose a framework to evaluate and quantify model distillation. Our method addresses two key aspects: (1) Identifying identity cognition contradictions to assess discrepancies in how models perceive and represent identity-related information, and (2) Analyzing multi-granularity response similarities across models to measure the extent of homogenization. Experimental results demonstrate two key insights: (1) Well-known closed-source and open-source LLMs usually…
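The second aspect, comparing responses at more than one granularity, can be illustrated with a minimal sketch. The truncated abstract does not specify the similarity measure, so everything below is an assumption made for illustration: the function names are hypothetical, and word n-gram Jaccard overlap is a stand-in, not the metric used in the paper.

```python
def ngram_set(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def multi_granularity_similarity(resp_a, resp_b, sizes=(1, 2, 3)):
    """Compare two responses at several n-gram sizes (an assumed notion of granularity).

    Returns a dict mapping each n to the Jaccard overlap of the n-gram sets.
    """
    return {n: jaccard(ngram_set(resp_a, n), ngram_set(resp_b, n)) for n in sizes}


# Hypothetical responses from two models to the same prompt.
scores = multi_granularity_similarity("the cat sat", "the cat ran")
print(scores)
```

Under this stand-in measure, two models that answer the same prompts with consistently high overlap at every n-gram size would look more homogenized than two whose overlap falls off quickly as n grows.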

Code & Models

Repositories

aegis1863/llms-distillation-quantification (Official)

Videos

Quantification of Large Language Model Distillation · Underline

Taxonomy

Topics: Topic Modeling

Methods: Balanced Selection