Ensemble Confidence Calibration for Sound Event Detection in Open-environment

Yuanjian Chen; Han Yin

arXiv:2507.09606 · cs.SD · July 15, 2025

Open Access

TL;DR

This paper applies ensemble methods to sound event detection (SED) in open environments and proposes a confidence calibration method, Energy-based Open-World Softmax (EOW-Softmax), to reduce overconfident predictions and better handle uncertainty on out-of-domain inputs.

Contribution

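The authors state that this is the first work to use ensemble methods in SED to improve robustness against out-of-domain (OOD) inputs. They also propose EOW-Softmax for confidence calibration and extend it to sound occurrence and overlap detection (SOD).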

Findings

01 Improved detection performance in open environments.

02 Reduced overconfidence in predictions.

03 Enhanced handling of out-of-domain scenarios.

Abstract

Sound event detection (SED) has made strong progress in controlled environments with clear event categories. However, real-world applications often take place in open environments. In such cases, current methods often produce predictions with too much confidence and lack proper ways to measure uncertainty. This limits their ability to adapt and perform well in new situations. To solve this problem, we are the first to use ensemble methods in SED to improve robustness against out-of-domain (OOD) inputs. We propose a confidence calibration method called Energy-based Open-World Softmax (EOW-Softmax), which helps the system better handle uncertainty in unknown scenes. We further apply EOW-Softmax to sound occurrence and overlap detection (SOD) by adjusting the prediction. In this way, the model becomes more adaptable while keeping its ability to detect overlapping events. Experiments show…
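
The abstract names two ingredients, ensembling and uncertainty measurement, without giving their details on this page. The sketch below illustrates only the generic idea behind them and is not the authors' EOW-Softmax: it averages the softmax outputs of several ensemble members for one audio frame and uses the entropy of the averaged distribution as an uncertainty score. The function names, the three-class setup, and the logit values are all hypothetical and chosen purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits):
    """Average the softmax outputs of all ensemble members for one frame."""
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / n_members for k in range(n_classes)]


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


# Hypothetical logits from three ensemble members over three event classes.
# In-domain-like frame: the members agree on class 0.
agree = [[4.0, 0.0, 0.0], [3.5, 0.2, 0.1], [4.2, -0.1, 0.3]]
# OOD-like frame: each member is confident, but about a different class.
disagree = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

p_agree = ensemble_predict(agree)
p_disagree = ensemble_predict(disagree)

print("agree:    max prob %.3f, entropy %.3f" % (max(p_agree), entropy(p_agree)))
print("disagree: max prob %.3f, entropy %.3f" % (max(p_disagree), entropy(p_disagree)))
```

Each member in the second frame is individually confident, yet the averaged distribution is close to uniform and its entropy is high. This is the sense in which an ensemble can temper overconfidence on unfamiliar inputs. How the paper combines ensembling with EOW-Softmax is not specified in the text above.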

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis