Knowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes

Anurag Kumar; Maksim Khadkevich; Christian Fugen

arXiv:1711.01369·cs.SD·September 10, 2018

TL;DR

This paper introduces a CNN-based framework for sound event detection and scene classification using weakly labeled web audio data, achieving state-of-the-art results and effective transfer learning.

Contribution

It presents novel methods for transfer learning from weakly labeled audio, enabling effective domain and task adaptation with a CNN model trained on variable-length audio.

Findings

01

Achieved human-level accuracy on the ESC-50 dataset.

02

Set new state-of-the-art results on AudioSet.

03

Demonstrated effective semantic representation learning.

Abstract

In this work we propose approaches to effectively transfer knowledge from weakly labeled web audio data. We first describe a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data. Our model trains efficiently on audio recordings of variable length; hence, it is well suited for transfer learning. We then propose methods to learn representations using this model that can be used effectively to solve the target task. We study both transductive and inductive transfer learning tasks, showing the effectiveness of our methods for both domain and task adaptation. We show that the representations learned using the proposed CNN model generalize well enough to reach human-level accuracy on the ESC-50 sound events dataset and set state-of-the-art results on this dataset. We further use them for the acoustic scene classification task and…
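
To illustrate how a model can handle recordings of variable length when only recording-level (weak) labels are available, here is a minimal sketch. It assumes the network emits one score vector per audio segment and that these are collapsed into a single recording-level vector by global pooling. The abstract above does not specify the pooling step, so `pool_segments`, the `mode` options, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import List


def pool_segments(segment_scores: List[List[float]], mode: str = "max") -> List[float]:
    """Collapse a (num_segments x num_classes) score matrix into one score per class.

    Hypothetical helper: the pooling choice is an assumption, not the paper's
    confirmed design.
    """
    if not segment_scores:
        raise ValueError("need at least one segment")
    # Transpose so that each tuple holds every segment's score for one class.
    columns = list(zip(*segment_scores))
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Two recordings of different lengths, three hypothetical classes.
short_clip = [[0.1, 0.8, 0.2],
              [0.3, 0.4, 0.1]]
long_clip = [[0.0, 0.1, 0.9],
             [0.2, 0.1, 0.7],
             [0.1, 0.0, 0.5],
             [0.6, 0.2, 0.4]]

# Both recordings yield a vector with one entry per class, regardless of
# how many segments they contain.
short_out = pool_segments(short_clip)
long_out = pool_segments(long_clip, mode="mean")
```

Because the pooled output has a fixed size, a recording-level weak label can supervise every recording in the same way, whatever its duration. The same fixed-size vector is the kind of representation that could be passed to a separate classifier on a target task.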

Code & Models

Repositories

michalmar/audio_classifier
pytorch
