Domain-Invariant Representation Learning of Bird Sounds
Ilyass Moummad, Romain Serizel, Emmanouil Benetos, Nicolas Farrugia

TL;DR
This paper introduces ProtoCLR, a contrastive learning method that enhances domain-invariant bird sound representations for passive acoustic monitoring, improving model generalization across different recording environments.
Contribution
It proposes ProtoCLR, a novel contrastive learning approach using class prototypes, to address domain shift in bioacoustic bird sound classification.
Findings
ProtoCLR outperforms SupCon in domain generalization tasks.
The method achieves better few-shot classification accuracy.
ProtoCLR reduces computational complexity relative to the SupCon loss by comparing examples to class prototypes rather than to each other pairwise.
Abstract
Passive acoustic monitoring (PAM) is crucial for bioacoustic research, enabling non-invasive species tracking and biodiversity monitoring. Citizen science platforms provide large annotated datasets from focal recordings, where the target species is intentionally recorded. However, PAM requires monitoring in passive soundscapes, which creates a domain shift between focal and passive recordings and challenges deep learning models trained on focal recordings. To address domain generalization, we leverage supervised contrastive learning by enforcing domain invariance across same-class examples from different domains. Additionally, we propose ProtoCLR, an alternative to the SupCon loss that reduces computational complexity by comparing examples to class prototypes instead of making pairwise comparisons. We conduct few-shot classification based on BIRB, a large-scale bird sound benchmark, to assess…
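The prototype idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `prototype_contrastive_loss`, the temperature value, and the choice to build each prototype as the normalized mean of the in-batch embeddings of its class are all assumptions made for illustration, and the exact loss in the paper may differ. The sketch only shows why the cost scales with (examples × classes) rather than (examples × examples).

```python
import numpy as np


def prototype_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical prototype-based contrastive loss (illustrative only).

    embeddings: (N, D) array of example embeddings.
    labels:     (N,) array of integer class labels.
    """
    eps = 1e-12
    # L2-normalize so that dot products are cosine similarities.
    z = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + eps)

    # Map arbitrary label values to indices 0..C-1.
    classes, targets = np.unique(labels, return_inverse=True)
    targets = targets.reshape(-1)
    one_hot = np.eye(len(classes))[targets]  # (N, C)

    # Assumption: a prototype is the normalized mean embedding of its class.
    prototypes = one_hot.T @ z / one_hot.sum(axis=0)[:, None]  # (C, D)
    prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True) + eps

    # Each example is compared to C prototypes: an (N, C) similarity matrix,
    # instead of the (N, N) matrix required by pairwise comparisons.
    logits = z @ prototypes.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy: pull each example toward the prototype of its own class.
    return -log_prob[np.arange(len(z)), targets].mean()
```

With a batch of N examples and C classes, the similarity matrix here has N × C entries. A pairwise supervised contrastive loss over the same batch would need N × N entries, which is the source of the complexity reduction the abstract refers to.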
