SEED: Sound Event Early Detection via Evidential Uncertainty
Xujiang Zhao, Xuchao Zhang, Wei Cheng, Wenchao Yu, Yuncong Chen, Haifeng Chen, Feng Chen

TL;DR
This paper introduces PENet, a novel neural network that models evidential uncertainty for early sound event detection, improving reliability and accuracy in real-time acoustic environment recognition.
Contribution
The paper proposes a new evidential neural network with Beta distribution modeling and a backtrack inference method for enhanced early sound event detection.
Findings
Improved detection F1 score by 3.8% over state-of-the-art.
Reduced time delay by 13.0%.
Enhanced uncertainty modeling with evidential approach.
Abstract
Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, which suffers from over-confidence in early-stage event detection and usually yields unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability with a Beta distribution. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event.…
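The idea of modeling each class probability with a Beta distribution can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the standard subjective-logic mapping from non-negative positive and negative evidence to Beta parameters, with a prior weight of 2 and a uniform base rate. The function name `beta_opinion` and the evidence values are hypothetical. It shows how two event classes can share the same expected probability while differing sharply in how much evidence supports it.

```python
import numpy as np

def beta_opinion(pos_evidence, neg_evidence, prior_weight=2.0):
    """Map per-class evidence to Beta parameters, expected probability, and vacuity.

    Assumption (not taken from the paper): the usual subjective-logic mapping
    with a uniform base rate, so alpha = r + W/2 and beta = s + W/2.
    """
    r = np.asarray(pos_evidence, dtype=float)  # evidence that the event is active
    s = np.asarray(neg_evidence, dtype=float)  # evidence that the event is inactive
    alpha = r + prior_weight / 2.0
    beta = s + prior_weight / 2.0
    strength = alpha + beta
    expected_prob = alpha / strength   # mean of Beta(alpha, beta)
    vacuity = prior_weight / strength  # uncertainty caused by lack of evidence
    return alpha, beta, expected_prob, vacuity

# Hypothetical evidence for two event classes at a single audio frame.
# Class 0 has almost no evidence (early in the event); class 1 has plenty.
alpha, beta, prob, vacuity = beta_opinion([1.0, 19.0], [0.0, 9.0])

print("expected probability:", prob)
print("vacuity:", vacuity)
```

Both classes end up with an expected probability of 2/3, but the first has a vacuity of about 0.67 and the second about 0.07. A detector that thresholds only on probability would treat them identically, whereas an uncertainty-aware detector could postpone its decision on the first class until more audio has arrived.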
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Acoustic Wave Phenomena Research
