FiGKD: Fine-Grained Knowledge Distillation via High-Frequency Detail Transfer
Seonghak Kim

TL;DR
FiGKD introduces a frequency-aware knowledge distillation method that selectively transfers high-frequency details from teacher to student, significantly improving fine-grained visual recognition performance.
Contribution
The paper proposes decomposing a model's logits with the discrete wavelet transform (DWT) so that distillation focuses on high-frequency details. The approach is architecture-agnostic and does not require intermediate features.
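To make the idea of a frequency split on logits concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single-level Haar wavelet and a mean-squared-error penalty on the detail coefficients. The wavelet family, the decomposition level, and the loss are not stated in this summary, so all three are illustrative choices. The function names `haar_dwt_1d` and `detail_distillation_loss` are hypothetical.

```python
import numpy as np


def haar_dwt_1d(x):
    """Single-level Haar DWT along the last axis.

    Assumption: Haar is used only for illustration. Odd-length inputs are
    padded by repeating the last value so that entries can be paired.
    Returns (approximation, detail) coefficient arrays.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.shape[-1] % 2 == 1:
        x = np.concatenate([x, x[..., -1:]], axis=-1)
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-frequency (content) part
    detail = (even - odd) / np.sqrt(2.0)  # high-frequency (detail) part
    return approx, detail


def detail_distillation_loss(student_logits, teacher_logits):
    """MSE between student and teacher detail coefficients.

    Assumption: MSE is a stand-in; the summary does not specify the loss.
    """
    _, d_student = haar_dwt_1d(student_logits)
    _, d_teacher = haar_dwt_1d(teacher_logits)
    return float(np.mean((d_student - d_teacher) ** 2))


if __name__ == "__main__":
    # Made-up logits for a batch of 2 samples and 6 classes.
    teacher = np.array([[2.0, 1.5, -0.5, 0.0, 3.0, 2.8],
                        [0.1, 0.3, 4.0, 3.9, -1.0, -1.2]])
    student = np.array([[1.0, 1.0, 0.0, 0.0, 2.0, 2.0],
                        [0.0, 0.0, 3.0, 3.0, -1.0, -1.0]])
    print(detail_distillation_loss(student, teacher))
```

In this sketch the detail coefficients are scaled differences between neighbouring logits, so adding the same constant to every logit leaves them unchanged. The loss therefore ignores a uniform shift and penalizes only mismatches in the local differences between neighbouring logits.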
Findings
Outperforms state-of-the-art distillation methods on multiple benchmarks.
Effectively captures subtle decision boundaries in fine-grained tasks.
Improves student model accuracy with minimal additional computation.
Abstract
Knowledge distillation (KD) is a widely adopted technique for transferring knowledge from a high-capacity teacher model to a smaller student model by aligning their output distributions. However, existing methods often underperform in fine-grained visual recognition tasks, where distinguishing subtle differences between visually similar classes is essential. This performance gap stems from the fact that conventional approaches treat the teacher's output logits as a single, undifferentiated signal, assuming all contained information is equally beneficial to the student. Consequently, student models may become overloaded with redundant signals and fail to capture the teacher's nuanced decision boundaries. To address this issue, we propose Fine-Grained Knowledge Distillation (FiGKD), a novel frequency-aware framework that decomposes a model's logits into low-frequency (content) and…
Taxonomy
Methods: Knowledge Distillation
