Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events
Noriyuki Tonami, Keisuke Imoto, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita

TL;DR
This paper introduces a curriculum learning approach for sound event detection that trains models from easy to difficult events, significantly improving detection accuracy over conventional methods.
Contribution
The paper proposes a novel objective function for SED that incorporates curriculum learning based on event difficulty, enhancing model performance.
Findings
F-score improved by 10.09 percentage points
Effective utilization of event difficulty in training
Significant performance gain over conventional methods
Abstract
In conventional sound event detection (SED) models, two types of events, namely, those that are present and those that do not occur in an acoustic scene, are regarded as the same type of events. The conventional SED methods cannot effectively exploit the difference between the two types of events. All time frames of sound events that do not occur in an acoustic scene are easily regarded as inactive in the scene, that is, the events are easy-to-train. The time frames of the events that are present in a scene must be classified as either active or inactive in the acoustic scene, that is, the events are difficult-to-train. To take advantage of the training difficulty, we apply curriculum learning to SED, where models are trained from easy- to difficult-to-train events. To utilize curriculum learning, we propose a new objective function for SED, wherein the events are trained…
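The abstract is cut off before the objective function is defined, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a frame-wise binary cross entropy in which events absent from a clip (easy-to-train) always contribute fully, while events present in the clip (difficult-to-train) are phased in by a hypothetical linear schedule. The function name `curriculum_bce` and the `progress` parameter are illustrative assumptions.

```python
import numpy as np

def curriculum_bce(probs, targets, progress, eps=1e-7):
    """Illustrative curriculum-weighted BCE (an assumption, not the paper's loss).

    probs, targets: arrays of shape (frames, events) for one clip.
    progress: training progress in [0, 1]; hypothetical schedule variable.
    """
    targets = np.asarray(targets, dtype=float)
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)

    # Standard frame-wise binary cross entropy per (frame, event) cell.
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))

    # An event counts as difficult-to-train if it is active in at least
    # one frame of the clip; otherwise it is easy-to-train.
    difficult = targets.max(axis=0) > 0  # shape (events,)

    # Assumed schedule: easy events keep weight 1, difficult events
    # ramp linearly from 0 to 1 as training progresses.
    weights = np.where(difficult, progress, 1.0)  # shape (events,)

    return float((bce * weights).mean())
```

With `progress=0.0` only the easy events contribute to the loss; with `progress=1.0` the function reduces to the ordinary mean binary cross entropy over all events.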
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
