Self-Distillation as a Performance Recovery Mechanism for LLMs: Counteracting Compression and Catastrophic Forgetting
Chi Liu, Xin Chen, Xu Zhou, Fangbo Tu, Srinivasan Manoharan

TL;DR
This paper introduces a self-distillation fine-tuning method to recover LLM performance degraded by compression (quantization, pruning) and by catastrophic forgetting, supported by a theoretical framework based on high-dimensional manifold alignment.
Contribution
It proposes a novel self-distillation framework for performance recovery in LLMs and provides a theoretical explanation using high-dimensional manifold alignment.
Findings
Self-distillation restores model capabilities after degradation caused by catastrophic forgetting during SFT, quantization, or pruning.
Performance recovery correlates strongly with high-dimensional manifold alignment.
CKA metrics reveal alignment between student and teacher activation trajectories.
Abstract
Large Language Models (LLMs) have achieved remarkable success, underpinning diverse AI applications. However, they often suffer from performance degradation due to factors such as catastrophic forgetting during Supervised Fine-Tuning (SFT), quantization, and pruning. In this work, we introduce a performance recovery framework based on Self-Distillation Fine-Tuning (SDFT) that effectively restores model capabilities. Complementing this practical contribution, we provide a rigorous theoretical explanation for the underlying recovery mechanism. We posit that an LLM's generative capability fundamentally relies on the high-dimensional manifold constructed by its hidden layers. To investigate this, we employ Centered Kernel Alignment (CKA) to quantify the alignment between student and teacher activation trajectories, leveraging its invariance to orthogonal transformations and scaling. Our…
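The CKA comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes the linear variant of CKA (the abstract does not say which kernel the authors use) and uses synthetic matrices in place of real teacher and student activations; the names `linear_cka`, `teacher`, `student_rotated`, and `student_random` are hypothetical. The sketch checks the two invariances the abstract relies on: an orthogonal transformation plus a uniform rescaling of the activations leaves the score unchanged.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices of shape (n_samples, n_features).

    Both matrices must describe the same n_samples inputs, in the same order.
    """
    # Center each feature column so the score depends only on covariance structure.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for hidden-layer activations (assumption: 64 inputs, 16 features).
rng = np.random.default_rng(0)
teacher = rng.standard_normal((64, 16))

# A "student" that is the teacher rotated by a random orthogonal matrix and rescaled.
q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
student_rotated = 3.0 * teacher @ q

# An unrelated "student" drawn independently of the teacher.
student_random = rng.standard_normal((64, 16))

cka_same = linear_cka(teacher, student_rotated)
cka_random = linear_cka(teacher, student_random)
print(f"rotated and scaled copy: {cka_same:.4f}")
print(f"independent activations: {cka_random:.4f}")
```

In this toy setting the rotated and rescaled copy scores 1 up to floating-point error, while the independent matrix scores lower. Applied per layer to real teacher and student activations, the same function would give one alignment score per layer, which is one plausible reading of the "activation trajectories" the abstract refers to.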
