Lightweight Model Pre-training via Language Guided Knowledge Distillation

Mingsheng Li; Lin Zhang; Mingzhen Zhu; Zilong Huang; Gang Yu; Jiayuan Fan; Tao Chen

arXiv:2406.11689·cs.CV·June 18, 2024

TL;DR

This paper introduces a novel language-guided distillation method for pre-training small models, leveraging category names and semantic spaces to improve downstream task performance.

Contribution

It proposes a new language-guided distillation framework using semantic spaces and a text encoder, enhancing knowledge transfer for small models.

Findings

01 Achieves state-of-the-art performance on downstream tasks
02 Outperforms models pre-trained with ImageNet or self-supervised methods
03 Validates effectiveness across classification, detection, and segmentation

Abstract

This paper studies the problem of pre-training for small models, which is essential for many mobile devices. Current state-of-the-art methods on this problem transfer the representational knowledge of a large network (the Teacher) into a smaller model (the Student) using self-supervised distillation, improving the performance of the small model on downstream tasks. However, existing approaches fall short in extracting, during distillation, the crucial knowledge needed to discern categories in downstream tasks. In this paper, for the first time, we introduce language guidance to the distillation process and propose a new method, the Language-Guided Distillation (LGD) system, which uses category names of the target downstream task to help refine the knowledge transferred between the teacher and student. To this end, we utilize a pre-trained text encoder to…
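The abstract is cut off before it describes how the text encoder is used, so the following is only a minimal sketch of the general idea of language-anchored distillation, not the paper's actual LGD objective. It assumes that category names have already been embedded by some text encoder into vectors (`text_anchors`, a hypothetical name), that teacher and student image features live in the same space as those vectors, and that the student is trained to match the teacher's similarity distribution over the anchors via a KL divergence. The function names and the `temperature` parameter are illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def softmax(scores, temperature):
    """Numerically stable softmax over temperature-scaled scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def anchor_distribution(feature, text_anchors, temperature=0.1):
    """Softmax over cosine similarities between one image feature and
    each category-name embedding (assumed precomputed by a text encoder)."""
    f = l2_normalize(feature)
    sims = [
        sum(a * b for a, b in zip(f, l2_normalize(t)))
        for t in text_anchors
    ]
    return softmax(sims, temperature)


def distill_loss(teacher_feat, student_feat, text_anchors, temperature=0.1):
    """KL(teacher || student) between the two anchor distributions.
    This is an assumed loss form chosen for illustration only."""
    p = anchor_distribution(teacher_feat, text_anchors, temperature)
    q = anchor_distribution(student_feat, text_anchors, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy usage: two hypothetical category embeddings in a 3-d space.
text_anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
teacher = [0.9, 0.1, 0.0]
loss_close = distill_loss(teacher, [0.8, 0.2, 0.0], text_anchors)
loss_far = distill_loss(teacher, [0.1, 0.9, 0.0], text_anchors)
```

In this toy setup a student feature that points toward the same category anchor as the teacher yields a smaller loss than one pointing toward a different anchor. In practice the teacher and student features would usually need learned projection layers to reach the text embedding space; that step is omitted here.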

Code & Models

Repositories

mzhenz/lgd (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Data Processing Techniques