Binaural Signal Representations for Joint Sound Event Detection and Acoustic Scene Classification

Daniel Aleksander Krause; Annamaria Mesaros

arXiv:2209.05900 · cs.SD · September 14, 2022 · 1 citation


Open Access

TL;DR

This paper investigates binaural spatial audio features as inputs to a joint deep learning model for sound event detection (SED) and acoustic scene classification (ASC). It reports that specific features, mainly GCC-phat and the sines and cosines of phase differences, improve performance.

Contribution

It evaluates binaural features such as GCC-phat and the sines and cosines of phase differences as inputs to a joint DNN for SED and ASC, and reports improved results compared with logmel energies.

Findings

01. Binaural features improve SED and ASC performance.

02. Joint training benefits from spatial audio features.

03. Specific binaural features outperform logmel energies.

Abstract

Sound event detection (SED) and acoustic scene classification (ASC) are two widely researched audio tasks that constitute an important part of research on acoustic scene analysis. Considering shared information between sound events and acoustic scenes, performing both tasks jointly is a natural part of a complex machine listening system. In this paper, we investigate the usefulness of several spatial audio features in training a joint deep neural network (DNN) model performing SED and ASC. Experiments are performed for two different datasets containing binaural recordings and synchronous sound event and acoustic scene labels to analyse the differences between performing SED and ASC separately or jointly. The presented results show that the use of specific binaural features, mainly the Generalized Cross Correlation with Phase Transform (GCC-phat) and sines and cosines of phase…
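
The two binaural features named in the abstract, GCC-phat and the sines and cosines of inter-channel phase differences, can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names (`stft`, `binaural_features`) and all parameters (`n_fft`, `hop`, `n_lags`, `eps`) are hypothetical choices made for illustration, and the paper's actual frame sizes and feature dimensions are not given on this page.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT of a 1-D signal, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def binaural_features(left, right, n_fft=512, hop=256, n_lags=64, eps=1e-8):
    """Return (gcc_phat, sin_ipd, cos_ipd) for a two-channel recording.

    All default values are illustrative assumptions, not the paper's settings.
    """
    spec_l = stft(left, n_fft, hop)
    spec_r = stft(right, n_fft, hop)

    # Cross-spectrum between the two ears, per frame and frequency bin.
    cross = spec_l * np.conj(spec_r)

    # Inter-channel phase difference, encoded as sine and cosine so the
    # feature is continuous across the +/- pi wrap-around.
    ipd = np.angle(cross)
    sin_ipd, cos_ipd = np.sin(ipd), np.cos(ipd)

    # Phase transform: discard magnitude, keep only phase, then go back
    # to the lag domain with an inverse FFT.
    phat = cross / (np.abs(cross) + eps)
    gcc = np.fft.irfft(phat, n=n_fft, axis=1)

    # Move zero lag to the centre and keep n_lags values around it.
    gcc = np.fft.fftshift(gcc, axes=1)
    mid = n_fft // 2
    gcc = gcc[:, mid - n_lags // 2 : mid + n_lags // 2]
    return gcc, sin_ipd, cos_ipd

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal(16000)
    # Hypothetical test signal: the right channel lags the left by 5 samples.
    right = np.concatenate([np.zeros(5), left[:-5]])
    gcc, sin_ipd, cos_ipd = binaural_features(left, right)
    print(gcc.shape, sin_ipd.shape, cos_ipd.shape)
```

In this sketch the GCC-phat output has one row per frame and one column per time lag, with zero lag at column `n_lags // 2`, so a delay between the channels appears as a peak offset from the centre. Features like these would typically be stacked with or substituted for logmel energies as network input, but how the paper combines them is not specified here.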


Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Diverse Musicological Studies