Audio Embeddings as Teachers for Music Classification
Yiwei Ding, Alexander Lerch

TL;DR
This paper explores using pre-trained audio embeddings as teachers to improve low-complexity music classification models, achieving better performance through feature regularization and knowledge transfer.
Contribution
The paper combines transfer learning with feature-based knowledge distillation, using pre-trained audio embeddings as teachers for low-complexity music classification models.
Findings
Student models trained with teacher embeddings perform significantly better than models trained without a teacher.
Knowledge in the pre-trained embeddings transfers effectively to the low-complexity students.
The method is compatible with classical knowledge distillation methods.
Abstract
Music classification has been one of the most popular tasks in the field of music information retrieval. With the development of deep learning models, the last decade has seen impressive improvements in a wide range of classification tasks. However, the increasing model complexity makes both training and inference computationally expensive. In this paper, we integrate the ideas of transfer learning and feature-based knowledge distillation and systematically investigate using pre-trained audio embeddings as teachers to guide the training of low-complexity student networks. By regularizing the feature space of the student networks with the pre-trained embeddings, the knowledge in the teacher embeddings can be transferred to the students. We use various pre-trained audio embeddings and test the effectiveness of the method on the tasks of musical instrument classification and music…
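The idea of regularizing a student's feature space with a teacher embedding can be sketched as a combined training loss. The following is a minimal illustration under stated assumptions, not the authors' formulation: the mean-squared-error distance, the linear projection `proj_weights`, the weighting factor `lam`, the single-label cross-entropy task loss, and all function names are hypothetical choices made only for this example.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def project(features, weights):
    """Linear map from the student feature space to the teacher embedding space.

    `weights` has one row per teacher dimension (hypothetical projection layer).
    """
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def feature_distance(student_proj, teacher_emb):
    """Mean squared error between projected student features and the teacher embedding.

    MSE is an assumed stand-in for whatever distance a real system would use.
    """
    return sum((s - t) ** 2 for s, t in zip(student_proj, teacher_emb)) / len(teacher_emb)

def distillation_loss(logits, label, student_feat, teacher_emb, proj_weights, lam=0.5):
    """Weighted sum of the task loss and the feature-regularization term."""
    task = cross_entropy(logits, label)
    reg = feature_distance(project(student_feat, proj_weights), teacher_emb)
    return (1.0 - lam) * task + lam * reg

# Toy usage: a 2-class student with 2-dimensional features and an identity projection.
identity = [[1.0, 0.0], [0.0, 1.0]]
loss = distillation_loss(
    logits=[2.0, -1.0], label=0,
    student_feat=[0.3, 0.7], teacher_emb=[0.5, 0.5],
    proj_weights=identity, lam=0.5,
)
```

In this sketch the teacher embedding is a fixed target computed ahead of time, so only the small student (and its projection) would be trained; setting `lam=0` recovers ordinary training without a teacher.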
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies
Methods: Knowledge Distillation
