Uncertainty Calibration of Multi-Label Bird Sound Classifiers
Raphael Schwinger, Ben McEwen, Vincent S. Kather, René Heinrich, Lukas Rauch, Sven Tomforde

TL;DR
This paper evaluates the calibration of multi-label bird sound classifiers, revealing significant variability across datasets and classes, and demonstrates simple methods to improve calibration using small calibration sets.
Contribution
It systematically benchmarks calibration of state-of-the-art bioacoustic classifiers and shows how post hoc calibration improves uncertainty estimates.
Findings
Model calibration varies significantly across datasets and classes.
Simple post hoc calibration methods can significantly improve calibration.
Less frequent classes tend to be better calibrated.
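To make the idea of post hoc calibration on a small calibration set concrete, here is a minimal sketch of one common method: temperature scaling applied to sigmoid logits. This summary does not say which methods the paper uses, so the choice of technique is an assumption, as are the function names (`fit_temperature`, `bce`, `sigmoid`) and the grid-search range.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable sigmoid: exp(-log(1 + exp(-z))).
    return np.exp(-np.logaddexp(0.0, -np.asarray(z, dtype=float)))

def bce(probs, labels, eps=1e-12):
    # Mean binary cross-entropy over all samples and classes.
    p = np.clip(probs, eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))

def fit_temperature(logits, labels, grid=None):
    """Pick a single temperature T that minimises BCE on the calibration set.

    logits, labels: arrays of shape (n_samples, n_classes); labels are 0/1.
    A grid search is used here for simplicity (an illustrative choice).
    """
    if grid is None:
        grid = np.logspace(-1, 1, 201)  # candidate T values from 0.1 to 10
    logits = np.asarray(logits, dtype=float)
    losses = [bce(sigmoid(logits / t), labels) for t in grid]
    return float(grid[int(np.argmin(losses))])

def apply_temperature(logits, temperature):
    # T > 1 softens overconfident scores; T < 1 sharpens underconfident ones.
    return sigmoid(np.asarray(logits, dtype=float) / temperature)
```

A single temperature only rescales confidence and leaves the ranking of scores within each class unchanged, so ranking-based discrimination metrics are unaffected. Because calibration reportedly varies across classes, a per-class variant (fitting one temperature or a Platt-style slope and bias per species) may be a better fit, at the cost of needing enough calibration examples for each class.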
Abstract
Passive acoustic monitoring enables large-scale biodiversity assessment, but reliable classification of bioacoustic sounds requires not only high accuracy but also well-calibrated uncertainty estimates to ground decision-making. In bioacoustics, calibration is challenged by overlapping vocalisations, long-tailed species distributions, and distribution shifts between training and deployment data. The calibration of multi-label deep learning classifiers within the domain of bioacoustics has not yet been assessed. We systematically benchmark the calibration of four state-of-the-art multi-label bird sound classifiers on the BirdSet benchmark, evaluating global, per-dataset, and per-class calibration using threshold-free calibration metrics (ECE, MCS) alongside discrimination metrics (cmAP). Model calibration varies significantly across datasets and classes. While Perch v2 and…
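As an illustration of the per-class calibration evaluation mentioned in the abstract, the sketch below computes a binned expected calibration error (ECE) for each class of a multi-label classifier. The exact ECE variant used in the paper is not given in this summary, so the equal-width binning, the default of 10 bins, and the function names `binary_ece` and `per_class_ece` are assumptions.

```python
import numpy as np

def binary_ece(probs, labels, n_bins=10):
    """Binned ECE for one class: weighted gap between the mean predicted
    probability and the observed positive rate in each bin."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Equal-width bins over [0, 1]; a probability of exactly 1.0 goes in the last bin.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(probs[mask].mean() - labels[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

def per_class_ece(probs, labels, n_bins=10):
    """ECE per class for arrays of shape (n_samples, n_classes)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return np.array([
        binary_ece(probs[:, c], labels[:, c], n_bins)
        for c in range(probs.shape[1])
    ])
```

For example, a class that is always predicted with probability 0.9 but is present in only half of the clips has a gap of 0.4 in its single occupied bin. Averaging the per-class values gives one possible global summary, though rare classes with few positives produce noisy estimates.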
Taxonomy
Topics: Animal Vocal Communication and Behavior · Amphibian and Reptile Biology · Wildlife-Road Interactions and Conservation
