How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks

Keisuke Imoto; Yuka Komatsu; Shunsuke Tsubaki; Tatsuya Komatsu

arXiv:2204.02279 · cs.SD · April 6, 2022

Open Access

TL;DR

This paper investigates how information on acoustic scenes benefits sound event detection, and how information on sound events benefits acoustic scene classification, using domain adversarial training and training with fake labels. The results suggest that even single-task methods implicitly exploit this information.

Contribution

It provides a detailed analysis of how acoustic scene classification and sound event detection benefit from each other's information, using domain adversarial training and fake-label training to probe that dependence.

Findings

01. Information on acoustic scenes and sound events improves SED and ASC performance, respectively.

02. Single-task methods outperform the GRL- and fake-label-based methods in the experiments.

03. Even single-task approaches appear to exploit this information implicitly.

Abstract

Acoustic scene classification (ASC) and sound event detection (SED) are fundamental tasks in environmental sound analysis, and many methods based on deep learning have been proposed. Considering that information on acoustic scenes and sound events helps SED and ASC mutually, some researchers have proposed a joint analysis of acoustic scenes and sound events by multitask learning (MTL). However, conventional works have not investigated in detail how acoustic scenes and sound events mutually benefit SED and ASC. We, therefore, investigate the impact of information on acoustic scenes and sound events on the performance of SED and ASC by using domain adversarial training based on a gradient reversal layer (GRL) or model training with fake labels. Experimental results obtained using the TUT Acoustic Scenes 2016/2017 and TUT Sound Events 2016/2017 show that pieces of information on acoustic…
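
The abstract names a gradient reversal layer (GRL) but this page does not give its implementation. As a rough illustration only, the general idea of a GRL can be sketched in plain Python: the layer passes features through unchanged in the forward pass and flips the sign of the gradient (scaled by a factor) in the backward pass, so the shared features are pushed away from being useful for the adversarial task. The class name `GradientReversal`, the helper `shared_gradient`, the scale `lam`, and all numeric values below are hypothetical and are not taken from the paper.

```python
from typing import List


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam: float = 1.0) -> None:
        # lam is an assumed scaling factor for the reversed gradient
        self.lam = lam

    def forward(self, features: List[float]) -> List[float]:
        # Features reach the adversarial branch unchanged
        return list(features)

    def backward(self, grad_output: List[float]) -> List[float]:
        # The gradient flowing back to the shared layers is sign-flipped
        return [-self.lam * g for g in grad_output]


def shared_gradient(
    grad_main: List[float], grad_adv: List[float], grl: GradientReversal
) -> List[float]:
    """Combine the main-task gradient with the reversed adversarial-task gradient."""
    reversed_adv = grl.backward(grad_adv)
    return [gm + ga for gm, ga in zip(grad_main, reversed_adv)]


# Toy values (hypothetical): gradients of an event loss and a scene loss
# with respect to a three-dimensional shared feature vector.
grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
passed = grl.forward(features)
combined = shared_gradient([1.0, 0.0, -2.0], [4.0, 2.0, -2.0], grl)
print(passed)
print(combined)
```

In this toy setup the shared features would be updated to reduce the main-task loss while increasing the adversarial-task loss, which is the usual way a GRL is used to suppress one kind of information in a shared representation.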

Taxonomy

Topics: Music and Audio Processing · Diverse Musicological Studies · Speech and Audio Processing