Feature-informed Embedding Space Regularization For Audio Classification

Yun-Ning Hung; Alexander Lerch

arXiv:2206.04850·eess.AS·June 13, 2022

TL;DR

This paper introduces two regularization methods that combine task-specific and pre-trained features to improve audio classification performance while reducing inference complexity.

Contribution

It proposes novel regularization techniques that leverage both detailed task-specific and generic pre-trained features, enhancing audio classification accuracy.

Findings

01. Proposed methods outperform baseline models.

02. Using combined features yields better results than individual features.

03. Improved state-of-the-art performance on multiple audio tasks.

Abstract

Feature representations derived from models pre-trained on large-scale datasets have shown their generalizability on a variety of audio analysis tasks. Despite this generalizability, however, task-specific features can outperform them if sufficient training data is available, as specific task-relevant properties can be learned. Furthermore, the complex pre-trained models bring considerable computational burdens during inference. We propose to leverage both detailed task-specific features from spectrogram input and generic pre-trained features by introducing two regularization methods that integrate the information of both feature classes. The workload is kept low during inference as the pre-trained features are only necessary for training. In experiments with the pre-trained features VGGish, OpenL3, and a combination of both, we show that the proposed methods not only outperform baseline…
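
The idea of regularizing a task-specific embedding with a pre-trained feature during training can be illustrated with a small sketch. The abstract does not state the form of the two regularizers, so the example below assumes one plausible choice: a cosine-distance penalty added to the classification loss. The function names, the `weight` parameter, and the assumption that both embeddings have the same dimensionality are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


def regularized_loss(logits, label, task_emb, pretrained_emb, weight=0.5):
    """Training loss: classification term plus an embedding-space penalty.

    Assumption: the penalty is a cosine distance that pulls the
    task-specific embedding toward the pre-trained feature (e.g. one
    extracted by VGGish or OpenL3). `weight` is a hypothetical
    trade-off hyperparameter.
    """
    penalty = cosine_distance(task_emb, pretrained_emb)
    return cross_entropy(logits, label) + weight * penalty


# Example: the penalty vanishes when both embeddings point the same way.
aligned = regularized_loss([2.0, 0.5], 0, [1.0, 0.0], [3.0, 0.0])
orthogonal = regularized_loss([2.0, 0.5], 0, [1.0, 0.0], [0.0, 1.0])
print(aligned < orthogonal)
```

In this sketch the pre-trained embedding appears only in the training loss. At inference time only the task-specific network producing `logits` would be evaluated, which matches the abstract's statement that the pre-trained features are only necessary for training.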

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Diverse Musicological Studies