DistillLens: Symmetric Knowledge Distillation Through Logit Lens
Manish Dhakal, Uthman Jinadu, Anjila Budathoki, Rajshekhar Sunderraman, Yi Ding

TL;DR
DistillLens introduces a symmetric knowledge distillation framework that aligns student and teacher models' intermediate states using a Logit Lens, improving performance on instruction-following tasks over traditional methods.
Contribution
It proposes a novel symmetric alignment method via Logit Lens, addressing limitations of existing feature-based distillation techniques.
Findings
Outperforms standard KD and feature-transfer baselines
Effective on GPT-2 and Llama architectures
Enhances instruction-following benchmark results
Abstract
Standard Knowledge Distillation (KD) compresses Large Language Models (LLMs) by optimizing final outputs, yet it typically treats the teacher's intermediate layer's thought process as a black box. While feature-based distillation attempts to bridge this gap, existing methods (e.g., MSE and asymmetric KL divergence) ignore the rich uncertainty profiles required for the final output. In this paper, we introduce DistillLens, a framework that symmetrically aligns the evolving thought processes of student and teacher models. By projecting intermediate hidden states into the vocabulary space via the Logit Lens, we enforce structural alignment using a symmetric divergence objective. Our analysis proves that this constraint imposes a dual-sided penalty, preventing both overconfidence and underconfidence while preserving the high-entropy information conduits essential for final deduction.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsIntelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Multimodal Machine Learning Applications
