Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

Giambattista Parascandolo; Heikki Huttunen; Tuomas Virtanen

arXiv:1604.00861·cs.SD·November 17, 2016

TL;DR

This paper introduces a BLSTM-based approach for detecting multiple overlapping sound events in real-life recordings. It improves average F1-score over the previous state of the art by a relative 6.8% on 1-second blocks and 15.1% on single frames.

Contribution

The paper presents a single multilabel BLSTM RNN that maps acoustic features of a mixture signal to binary activity indicators for each event class, outperforming prior approaches to polyphonic sound event detection in real-life recordings.
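To illustrate what "multilabel" means here, the following is a minimal sketch of turning per-class network outputs for one frame into binary activity indicators. It is not the authors' implementation: the sigmoid output layer, the 0.5 threshold, the function names, and the example logits are all assumptions chosen for illustration. Only the class names are taken from the examples in the abstract.

```python
import math

# Example class names taken from the abstract; the real system uses 61 classes.
CLASSES = ["music", "car", "speech"]


def sigmoid(x):
    """Squash a real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def activity_indicators(logits, threshold=0.5):
    """Map one frame of per-class scores to binary activity indicators.

    Each class is thresholded independently, so any number of classes
    (including none) can be active in the same frame. The 0.5 threshold
    is an assumption, not a value stated in the source.
    """
    return [int(sigmoid(z) >= threshold) for z in logits]


# Hypothetical scores for a single frame (made up for illustration).
frame_logits = [2.0, -1.5, 0.3]
active = activity_indicators(frame_logits)
print([name for name, on in zip(CLASSES, active) if on])
```

Because each class gets its own independent output, overlapping events such as music and speech can be marked active in the same frame, which a single softmax over classes would not allow.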

Findings

01

Achieved an average F1-score of 65.5% on 1-second blocks and 64.7% on single frames.

02

Improved detection accuracy using data augmentation techniques.

03

Outperformed the previous state-of-the-art approach by a relative 6.8% on 1-second blocks and 15.1% on single frames.

Abstract

In this paper we present an approach to polyphonic sound event detection in real-life recordings based on bi-directional long short-term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map acoustic features of a mixture signal consisting of sounds from multiple classes to binary activity indicators of each event class. Our method is tested on a large database of real-life recordings, with 61 classes (e.g. music, car, speech) from 10 different everyday contexts. The proposed method outperforms previous approaches by a large margin, and the results are further improved using data augmentation techniques. Overall, our system reports an average F1-score of 65.5% on 1-second blocks and 64.7% on single frames, relative improvements over the previous state-of-the-art approach of 6.8% and 15.1%, respectively.
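The two evaluation resolutions mentioned in the abstract (single frames and 1-second blocks) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact evaluation protocol: it assumes a class counts as active in a block if it is active in any frame of that block, and it computes F1 from counts pooled over all segments and classes. The function names, `frames_per_block`, and the toy data are hypothetical.

```python
def to_blocks(frames, frames_per_block):
    """Collapse frame-level indicators into block-level indicators.

    Assumption: a class is active in a block if it is active in at
    least one frame of that block.
    """
    blocks = []
    for start in range(0, len(frames), frames_per_block):
        chunk = frames[start:start + frames_per_block]
        blocks.append([int(any(column)) for column in zip(*chunk)])
    return blocks


def f1_score(reference, predicted):
    """F1 from true/false positives and false negatives pooled over
    all segments and classes."""
    tp = fp = fn = 0
    for ref_row, pred_row in zip(reference, predicted):
        for r, p in zip(ref_row, pred_row):
            if r and p:
                tp += 1
            elif p:
                fp += 1
            elif r:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: 4 frames, 2 classes (rows are frames, columns are classes).
reference = [[1, 0], [1, 0], [0, 1], [0, 0]]
predicted = [[1, 0], [0, 0], [0, 1], [0, 1]]

frame_f1 = f1_score(reference, predicted)
# frames_per_block=2 is arbitrary; a real value depends on the frame hop size.
block_f1 = f1_score(to_blocks(reference, 2), to_blocks(predicted, 2))
print(f"frame-level F1: {frame_f1:.3f}, block-level F1: {block_f1:.3f}")
```

In this toy case the block-level score is higher than the frame-level score, because small timing errors inside a block are forgiven once frames are pooled. That is consistent with the abstract reporting a higher F1-score on 1-second blocks than on single frames.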
