MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution
Yuxuan Jiang, Chen Feng, Fan Zhang, and David Bull

TL;DR
This paper introduces MTKD, a multi-teacher knowledge distillation framework for image super-resolution that combines the outputs of several teacher models and uses a wavelet-based loss to improve the performance of compact student models.
Contribution
It proposes a novel multi-teacher distillation approach with a wavelet-based loss tailored for super-resolution, outperforming existing single-teacher KD methods.
Findings
Achieves up to 0.46 dB PSNR improvement over state-of-the-art KD methods.
Remains effective across multiple network architectures.
Improves the super-resolution performance of compact student networks.
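To put the 0.46 dB figure in context, the sketch below computes PSNR from mean squared error and converts a PSNR gain into the corresponding MSE reduction. It is an illustrative helper, not code from the paper: the function names psnr and mse_ratio_for_gain and the 8-bit peak value of 255 are assumptions made here.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized images given as nested lists.

    `peak` is the maximum possible pixel value (255 assumed for 8-bit images).
    """
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by `gain_db` decibels."""
    return 10.0 ** (-gain_db / 10.0)

if __name__ == "__main__":
    # A 0.46 dB gain corresponds to an MSE ratio of about 0.90.
    print(round(mse_ratio_for_gain(0.46), 4))
```

Because PSNR is logarithmic in MSE, a 0.46 dB gain corresponds to roughly 10% lower mean squared error at a fixed peak value.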
Abstract
Knowledge distillation (KD) has emerged as a promising technique in deep learning, typically employed to enhance a compact student network by learning from its high-performance but more complex teacher variant. When applied in the context of image super-resolution, most KD approaches are modified versions of methods developed for other computer vision tasks, which are based on training strategies with a single teacher and simple loss functions. In this paper, we propose a novel Multi-Teacher Knowledge Distillation (MTKD) framework specifically for image super-resolution. It exploits the advantages of multiple teachers by combining and enhancing the outputs of these teacher models, which then guides the learning process of the compact student network. To achieve more effective learning performance, we have also developed a new wavelet-based loss function for MTKD, which can better…
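The general idea of a multi-teacher target combined with a wavelet-domain loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract is truncated and does not specify the loss, so the one-level Haar transform, the L1 distance, and the plain averaging of teacher outputs (a stand-in for the paper's combining and enhancing step) are all choices made here. The function names are hypothetical.

```python
def haar_dwt2(img):
    """One-level orthonormal 2D Haar transform of an even-sized image.

    `img` is a nested list of numbers. Returns four sub-bands: the
    low-pass approximation followed by three detail bands.
    """
    approx, detail_a, detail_b, detail_c = [], [], [], []
    for i in range(0, len(img), 2):
        rows = ([], [], [], [])
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2.0)  # local average
            rows[1].append((a - b + c - d) / 2.0)  # difference across columns
            rows[2].append((a + b - c - d) / 2.0)  # difference across rows
            rows[3].append((a - b - c + d) / 2.0)  # diagonal difference
        approx.append(rows[0])
        detail_a.append(rows[1])
        detail_b.append(rows[2])
        detail_c.append(rows[3])
    return approx, detail_a, detail_b, detail_c

def mean_abs_diff(x, y):
    """Mean absolute difference (L1) between two equally sized bands."""
    flat_x = [v for row in x for v in row]
    flat_y = [v for row in y for v in row]
    return sum(abs(p - q) for p, q in zip(flat_x, flat_y)) / len(flat_x)

def average_teachers(outputs):
    """ASSUMPTION: plain pixel-wise average of the teacher outputs."""
    n = len(outputs)
    height, width = len(outputs[0]), len(outputs[0][0])
    return [[sum(t[i][j] for t in outputs) / n for j in range(width)]
            for i in range(height)]

def wavelet_l1_loss(student, target):
    """Sum of per-band L1 distances in the Haar domain (assumed form)."""
    return sum(mean_abs_diff(s, t)
               for s, t in zip(haar_dwt2(student), haar_dwt2(target)))

if __name__ == "__main__":
    teachers = [[[0, 0], [0, 0]], [[2, 2], [2, 2]]]
    target = average_teachers(teachers)
    print(wavelet_l1_loss([[0, 0], [0, 0]], target))
```

Comparing sub-bands separately lets a loss penalize errors in the detail bands, which carry edges and texture, independently of errors in the low-pass approximation.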
Taxonomy
Topics: Advanced Image Processing Techniques
Methods: Knowledge Distillation
