Sound Event Detection Using Duration Robust Loss Function

Daichi Akiyama; Keisuke Imoto; Noriyuki Tonami; Yuki Okamoto; Ryosuke Yamanishi; Takahiro Fukumori; Yoichi Yamashita

arXiv:2006.15253 · cs.SD · June 30, 2020 · 1 cites

TL;DR

This paper introduces a duration robust loss function for sound event detection that addresses class imbalance by focusing training on short-duration events, leading to improved detection performance.

Contribution

The paper proposes a novel class-wise reweighted loss function that enhances sound event detection by accounting for event duration variability.

Findings

01

Improved detection performance by 3.15 percentage points on TUT Sound Events 2016/2017.

02

Enhanced detection accuracy by 4.37 percentage points on TUT Acoustic Scenes 2016.

03

Demonstrated effectiveness of duration-aware loss in handling class imbalance.

Abstract

Many machine-learning-based methods of sound event detection (SED) treat each segmented time frame as one data sample for model training. However, the durations of sound events vary greatly depending on the sound event class, e.g., the sound event "fan" has a long duration, while the sound event "mouse clicking" is instantaneous. This difference in duration between sound event classes causes a serious data imbalance problem in SED. In this paper, we propose a method for SED using a duration robust loss function, which can focus model training on sound events of short duration. In the proposed method, we focus on the relationship between the duration of a sound event and the ease or difficulty of model training. In particular, many sound events of long duration (e.g., the sound event "fan") are stationary sounds, which have less variation in their acoustic features…
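The abstract is truncated before the loss is defined, so the exact formulation is not available here. As a rough illustration of the general idea of class-wise reweighting, the sketch below weights a frame-level binary cross-entropy by the inverse of each class's active-frame count, so that rarely active (short-duration) classes contribute more to the loss. The weighting rule and the names `class_weights` and `weighted_bce` are assumptions made for illustration, not the authors' method.

```python
import math


def class_weights(labels):
    """Hypothetical inverse-frequency weights, one per event class.

    labels: list of frames, each a list of 0/1 activity flags per class.
    Weights are scaled to sum to the number of classes, so uniform
    weighting corresponds to a weight of 1.0 for every class.
    """
    n_frames = len(labels)
    n_classes = len(labels[0])
    # Number of frames in which each class is active.
    active = [sum(frame[c] for frame in labels) for c in range(n_classes)]
    # Classes that are never active get weight 0 to avoid division by zero.
    raw = [n_frames / a if a > 0 else 0.0 for a in active]
    total = sum(raw)
    if total == 0:
        return [1.0] * n_classes
    return [w * n_classes / total for w in raw]


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Class-wise weighted binary cross-entropy averaged over frames and classes."""
    total = 0.0
    for p_frame, y_frame in zip(probs, labels):
        for p, y, w in zip(p_frame, y_frame, weights):
            p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
            total -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / (len(labels) * len(weights))


# Toy example: class 0 ("fan") is active in 8 of 10 frames,
# class 1 ("mouse clicking") in only 2 of 10.
labels = [[1, 0]] * 6 + [[1, 1]] * 2 + [[0, 0]] * 2
weights = class_weights(labels)
probs = [[0.5, 0.5]] * len(labels)
loss = weighted_bce(probs, labels, weights)
```

In this toy case the short-duration class receives the larger weight, so errors on "mouse clicking" frames are penalized more heavily than errors on "fan" frames.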

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies