Joint Analysis of Acoustic Scenes and Sound Events with Weakly labeled Data

Shunsuke Tsubaki; Keisuke Imoto; Nobutaka Ono

arXiv:2207.04357·cs.SD·July 12, 2022·1 cites

Open Access

TL;DR

This paper introduces a multi-task learning approach for joint acoustic scene and sound event analysis using weak labels, reducing annotation effort while improving performance over traditional methods.

Contribution

It proposes an MTL framework that trains sound event detection from weak labels via multiple-instance learning, and it evaluates four pooling functions, reporting improved results on scene classification and event detection.

Findings

01

Weakly supervised MTL outperforms single-task models.

02

Four pooling functions, including max, average, and exponential softmax pooling, are compared for weakly supervised event detection.

03

The method improves both scene classification and event detection accuracy.

Abstract

Considering that acoustic scenes and sound events are closely related to each other, in some previous papers, a joint analysis of acoustic scenes and sound events utilizing multitask learning (MTL)-based neural networks was proposed. In conventional methods, a strongly supervised scheme is applied to sound event detection in MTL models, which requires strong labels of sound events in model training; however, annotating strong event labels is quite time-consuming. In this paper, we thus propose a method for the joint analysis of acoustic scenes and sound events based on the MTL framework with weak labels of sound events. In particular, in the proposed method, we introduce the multiple-instance learning scheme for weakly supervised training of sound event detection and evaluate four pooling functions, namely, max pooling, average pooling, exponential softmax pooling, and attention…
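The multiple-instance learning idea in the abstract can be illustrated with a minimal sketch: frame-level event probabilities are pooled into one clip-level probability, which is then compared with the weak (clip-level) label. The code below covers three of the pooling functions named above. The function names, the example probabilities, and the binary cross-entropy loss are assumptions made for illustration, not the authors' implementation; the exponential softmax form used here is the common one in which each frame is weighted by the exponential of its own probability.

```python
import math

def max_pool(frame_probs):
    # Clip-level probability is the most confident frame.
    return max(frame_probs)

def average_pool(frame_probs):
    # Clip-level probability is the mean over all frames.
    return sum(frame_probs) / len(frame_probs)

def exp_softmax_pool(frame_probs):
    # Weighted mean where each frame's weight is exp(its probability),
    # so confident frames count more than in plain averaging.
    weights = [math.exp(p) for p in frame_probs]
    return sum(p * w for p, w in zip(frame_probs, weights)) / sum(weights)

def weak_label_bce(clip_prob, weak_label, eps=1e-7):
    # Binary cross-entropy between the pooled probability and the
    # clip-level (weak) label; eps guards against log(0).
    # Assumed loss for illustration only.
    p = min(max(clip_prob, eps), 1 - eps)
    return -(weak_label * math.log(p) + (1 - weak_label) * math.log(1 - p))

# Hypothetical frame-level probabilities for one event class in one clip.
frame_probs = [0.1, 0.2, 0.9, 0.8, 0.1]

for name, pool in [("max", max_pool),
                   ("average", average_pool),
                   ("exp-softmax", exp_softmax_pool)]:
    clip_prob = pool(frame_probs)
    print(name, round(clip_prob, 4), round(weak_label_bce(clip_prob, 1), 4))
```

In this sketch, max pooling passes the training signal only through the single strongest frame, average pooling spreads it evenly over all frames, and exponential softmax pooling sits between the two.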

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies