A report on sound event detection with different binaural features

Sharath Adavanne; Tuomas Virtanen

arXiv:1710.02997·cs.SD·October 10, 2017·60 cites

Open Access

TL;DR

This paper compares binaural audio features with single-channel features for sound event detection and finds that the binaural features consistently perform equal to or better than the single-channel features on the error rate metric.

Contribution

The paper presents a comparative analysis of three binaural features for sound event detection, using a stacked convolutional and recurrent neural network on the publicly available TUT Sound Events 2017 dataset.

Findings

01

Binaural features consistently perform equal to or better than single-channel features with respect to error rate.

02

The improvement is observed in the error rate metric; F-score is reported alongside it.

03

Results are evaluated on the publicly available TUT Sound Events 2017 dataset, which is 70 minutes long.

Abstract

In this paper, we compare the performance of using binaural audio features in place of single-channel features for sound event detection. Three different binaural features are studied and evaluated on the publicly available TUT Sound Events 2017 dataset, which is 70 minutes long. Sound event detection is performed separately with single-channel and binaural features using a stacked convolutional and recurrent neural network, and the evaluation is reported using the standard metrics of error rate and F-score. The studied binaural features are seen to consistently perform equal to or better than the single-channel features with respect to the error rate metric.
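The abstract names error rate and F-score as its metrics without defining them. As a rough illustration, the sketch below computes both in the segment-based form commonly used for sound event detection, where the error rate is the sum of substitutions, deletions, and insertions divided by the number of active reference events. It assumes binary activity matrices of shape (segments, classes); the function name `segment_metrics` and the input layout are hypothetical choices for this example, and the paper's exact evaluation settings (such as segment length) are not stated here.

```python
import numpy as np


def segment_metrics(reference, predicted):
    """Return (error_rate, f_score) for binary activity matrices.

    Both inputs have shape (n_segments, n_classes); a nonzero entry means
    the class is active in that segment. This layout is an assumption made
    for illustration, not something specified by the abstract.
    """
    ref = np.asarray(reference, dtype=bool)
    pred = np.asarray(predicted, dtype=bool)
    if ref.shape != pred.shape:
        raise ValueError("reference and predicted must have the same shape")

    # Per-segment counts of true positives, false positives, false negatives.
    tp = np.logical_and(ref, pred).sum(axis=1)
    fp = np.logical_and(~ref, pred).sum(axis=1)
    fn = np.logical_and(ref, ~pred).sum(axis=1)

    # Within a segment, a false positive paired with a false negative counts
    # as one substitution; the remainder are deletions or insertions.
    substitutions = np.minimum(fp, fn).sum()
    deletions = np.maximum(0, fn - fp).sum()
    insertions = np.maximum(0, fp - fn).sum()

    n_ref = ref.sum()  # total number of active reference events
    tp_total, fp_total, fn_total = tp.sum(), fp.sum(), fn.sum()

    error_rate = (substitutions + deletions + insertions) / n_ref if n_ref else 0.0
    denom = 2 * tp_total + fp_total + fn_total
    f_score = 2 * tp_total / denom if denom else 0.0
    return float(error_rate), float(f_score)


if __name__ == "__main__":
    # Toy data: 4 segments, 3 event classes.
    reference = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]]
    predicted = [[1, 0, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
    er, f = segment_metrics(reference, predicted)
    print(f"error rate = {er:.2f}, F-score = {f:.2f}")
```

A lower error rate is better and a higher F-score is better, so the two metrics can rank feature sets differently. This is why the abstract's conclusion is stated specifically with respect to error rate.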

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis