Single-Teacher View Augmentation: Boosting Knowledge Distillation via Angular Diversity

Seonghoon Yu; Dongjun Nam; Dina Katabi; Jeany Son

arXiv:2510.22480·cs.CV·October 28, 2025

TL;DR

This paper introduces a cost-effective method to enhance knowledge distillation by creating diverse teacher views through angular diversity objectives, improving student performance without multiple teachers.

Contribution

The paper proposes a novel single-teacher multi-view augmentation technique using angular diversity objectives, reducing computational costs while boosting distillation effectiveness.

Findings

01. Outperforms existing knowledge augmentation methods.

02. Compatible with various KD frameworks, improving generalization.

03. Theoretically reduces the upper bound on the ensemble loss.

Abstract

Knowledge Distillation (KD) aims to train a lightweight student model by transferring knowledge from a large, high-capacity teacher. Recent studies have shown that leveraging diverse teacher perspectives can significantly improve distillation performance; however, achieving such diversity typically requires multiple teacher networks, leading to high computational costs. In this work, we propose a novel cost-efficient knowledge augmentation method for KD that generates diverse multi-views by attaching multiple branches to a single teacher. To ensure meaningful semantic variation across multi-views, we introduce two angular diversity objectives: 1) constrained inter-angle diversify loss, which maximizes angles between augmented views while preserving proximity to the original teacher output, and 2) intra-angle diversify loss, which encourages an even distribution of views around the…
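To make the first objective more concrete, here is a minimal NumPy sketch of one way to reward large angles between augmented views while penalizing views that drift too far from the teacher output. It is an illustration under stated assumptions, not the paper's formulation: the function names, the hinge-style penalty, and the `max_angle` and `weight` parameters are all hypothetical. The second objective is not sketched because its description is truncated above.

```python
import numpy as np

def unit(x, eps=1e-12):
    # Normalize vectors along the last axis; eps guards against zero norms.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def angle_matrix(a, b):
    # Angles in radians between every row of a and every row of b.
    cos = np.clip(unit(a) @ unit(b).T, -1.0, 1.0)
    return np.arccos(cos)

def inter_angle_sketch(views, teacher, max_angle=0.5, weight=10.0):
    """Hypothetical loss: spread the views apart, keep them near the teacher.

    views:   (n_views, dim) outputs of the augmented branches (assumed shape)
    teacher: (dim,) original teacher output (assumed shape)
    max_angle, weight: illustrative hyperparameters, not from the paper
    """
    n = views.shape[0]
    pair = angle_matrix(views, views)
    upper = np.triu_indices(n, k=1)           # each unordered pair once
    mean_pair_angle = pair[upper].mean()      # larger means more diverse
    to_teacher = angle_matrix(views, teacher[None, :])[:, 0]
    # Hinge penalty: only views beyond max_angle from the teacher are charged.
    penalty = np.maximum(to_teacher - max_angle, 0.0).mean()
    return -mean_pair_angle + weight * penalty

teacher = np.array([1.0, 0.0, 0.0])
collapsed = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
spread = np.array([[1.0, 0.1, 0.0], [1.0, -0.1, 0.0]])
far = np.array([[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])

loss_collapsed = inter_angle_sketch(collapsed, teacher)
loss_spread = inter_angle_sketch(spread, teacher)
loss_far = inter_angle_sketch(far, teacher)
```

In this toy setup, views that spread apart while staying close to the teacher score lower than views that collapse onto the teacher, and views that wander far from the teacher are penalized when `weight` is large enough to outweigh the diversity reward.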
