Do sound event representations generalize to other audio tasks? A case study in audio transfer learning

Anurag Kumar; Yun Wang; Vamsi Krishna Ithapu; Christian Fuegen

arXiv:2106.11335·cs.SD·June 23, 2021

TL;DR

This paper investigates whether audio representations learned from sound event detection transfer to other audio tasks when only a simple linear classifier is trained on top of them. It reports strong transfer performance and examines which attributes of the representations make that transfer possible.

Contribution

It demonstrates that neural network-based sound event representations generalize well across diverse audio tasks through a simple linear transfer method, and it offers insight into the attributes of those representations.

Findings

01 High performance transfer with simple linear classifiers

02 Sound event representations capture transferable audio features

03 Insights into attributes enabling efficient transfer

Abstract

Transfer learning is critical for efficient information transfer across multiple related learning problems. A simple yet effective transfer learning approach uses deep neural networks trained on a large-scale task as feature extractors. The resulting representations are then used to learn related downstream tasks. In this paper, we investigate the transfer learning capacity of audio representations obtained from neural networks trained on a large-scale sound event detection dataset. We build and evaluate these representations across a wide range of other audio tasks, via a simple linear classifier transfer mechanism. We show that such simple linear transfer is already powerful enough to achieve high performance on the downstream tasks. We also provide insights into the attributes of sound event representations that enable such efficient information transfer.
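The linear transfer mechanism described in the abstract can be illustrated with a minimal sketch: features come from a frozen extractor, and only a linear (softmax) classifier is trained on the downstream labels. Everything below is an assumption made for illustration and is not the authors' implementation: `frozen_embed` is a hypothetical stand-in for a pretrained sound event network (here a fixed random projection with mean pooling over time), the data are synthetic, and the function names, learning rate, and epoch count are arbitrary.

```python
import numpy as np


def frozen_embed(x, proj):
    """Hypothetical stand-in for a frozen pretrained network.

    x:    (n_clips, n_frames, n_bands) array of spectrogram-like inputs
    proj: (n_bands, dim) fixed projection that is never updated
    Returns clip-level embeddings of shape (n_clips, dim).
    """
    # Fixed projection + ReLU per frame, then mean pooling over time.
    return np.maximum(x @ proj, 0.0).mean(axis=1)


def train_linear_probe(feats, labels, n_classes, lr=0.5, epochs=200):
    """Fit softmax regression on fixed features with full-batch gradient descent."""
    n, d = feats.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs = probs / probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (feats.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(feats, W, b):
    """Return the highest-scoring class index for each row of feats."""
    return (feats @ W + b).argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_per_class, n_frames, n_bands, dim = 20, 10, 8, 16

    # Synthetic two-class "downstream task": each class has extra energy
    # in a different half of the frequency bands.
    x = rng.random((2 * n_per_class, n_frames, n_bands))
    x[:n_per_class, :, : n_bands // 2] += 1.0
    x[n_per_class:, :, n_bands // 2 :] += 1.0
    y = np.array([0] * n_per_class + [1] * n_per_class)

    # The projection plays the role of frozen pretrained weights.
    proj = rng.normal(size=(n_bands, dim)) / np.sqrt(n_bands)

    feats = frozen_embed(x, proj)
    W, b = train_linear_probe(feats, y, n_classes=2, lr=0.1)
    print("train accuracy:", (predict(feats, W, b) == y).mean())
```

The point of the design is that only `W` and `b` are updated, so downstream accuracy reflects how linearly separable the task already is in the frozen embedding space.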