Simplifying Knowledge Transfer in Pretrained Models
Siddharth Jain, Shyamgopal Karthik, Vineet Gandhi

TL;DR
This paper introduces a knowledge transfer method that draws on large, publicly available model repositories. Pretrained models act as either teachers or students, which improves performance on image classification, semantic segmentation, and video saliency prediction.
Contribution
It proposes a data partitioning strategy under which pretrained models autonomously adopt the role of either teacher or student, so that knowledge can be transferred in both directions between a pair of models.
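The summary does not say how the partition is computed, so the following is only a minimal sketch under an assumed rule: on each sample, the model with the lower cross-entropy on the ground-truth label acts as teacher and the other as student. The function names (`partition_roles`, `transfer_losses`) and the KL-based transfer loss are illustrative assumptions, not the authors' implementation.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under one predicted distribution."""
    return -math.log(max(probs[label], EPS))


def partition_roles(probs_a, probs_b, labels):
    """Split sample indices by which model acts as teacher.

    Assumed rule (hypothetical, not taken from the paper): the model with the
    lower cross-entropy on a sample teaches; ties go to model A.
    """
    a_teaches, b_teaches = [], []
    for i, (pa, pb, y) in enumerate(zip(probs_a, probs_b, labels)):
        if cross_entropy(pa, y) <= cross_entropy(pb, y):
            a_teaches.append(i)
        else:
            b_teaches.append(i)
    return a_teaches, b_teaches


def kl_divergence(teacher, student):
    """KL(teacher || student) for two discrete distributions."""
    return sum(
        t * math.log(t / max(s, EPS))
        for t, s in zip(teacher, student)
        if t > 0
    )


def transfer_losses(probs_a, probs_b, labels):
    """Return (loss for A as student, loss for B as student).

    Each model is only penalised on the partition where the other model teaches.
    """
    a_teaches, b_teaches = partition_roles(probs_a, probs_b, labels)

    def mean_kl(indices, teacher_probs, student_probs):
        if not indices:
            return 0.0
        total = sum(kl_divergence(teacher_probs[i], student_probs[i]) for i in indices)
        return total / len(indices)

    loss_b = mean_kl(a_teaches, probs_a, probs_b)  # A teaches, B learns
    loss_a = mean_kl(b_teaches, probs_b, probs_a)  # B teaches, A learns
    return loss_a, loss_b
```

In a real training loop the two losses would be added to each model's task loss and the partition recomputed per batch; that wiring is omitted here because the summary gives no details on it.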
Findings
Improved ViT-B image classification performance by approximately 1.4% through bidirectional knowledge transfer with ViT-T.
Boosted segmentation metrics through cross-architecture knowledge transfer.
Achieved state-of-the-art results in video saliency prediction.
Abstract
Pretrained models are ubiquitous in the current deep learning landscape, offering strong results on a broad range of tasks. Recent works have shown that models differing in various design choices exhibit categorically diverse generalization behavior, resulting in one model grasping distinct data-specific insights unavailable to the other. In this paper, we propose to leverage large publicly available model repositories as an auxiliary source of model improvements. We introduce a data partitioning strategy where pretrained models autonomously adopt either the role of a student, seeking knowledge, or that of a teacher, imparting knowledge. Experiments across various tasks demonstrate the effectiveness of our proposed approach. In image classification, we improved the performance of ViT-B by approximately 1.4% through bidirectional knowledge transfer with ViT-T. For semantic segmentation,…
