Self-Supervised Learning for Few-Shot Bird Sound Classification

Ilyass Moummad; Romain Serizel; Nicolas Farrugia

arXiv:2312.15824 · cs.SD · February 12, 2024 · 1 citation

TL;DR

This paper demonstrates that self-supervised learning can effectively generate meaningful bird sound representations from unlabeled audio data, enabling accurate few-shot classification and improving results through targeted window selection.

Contribution

It introduces a self-supervised learning approach for bird sound classification that does not require annotations and enhances performance by selecting high-activation audio segments.

Findings

01. SSL learns meaningful bird sound representations from unlabeled data.

02. Representations generalize well to new bird species in few-shot scenarios.

03. Selecting high-activation windows improves representation quality.
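Finding 02 concerns few-shot generalization to unseen species. A minimal sketch of one common way to evaluate frozen embeddings in that setting is nearest-prototype classification: average the few labeled embeddings per class, then assign each query to the most similar class mean. This is an illustrative assumption, since the summary here does not state which few-shot classifier the paper uses. The function names, toy 2-D embeddings, and labels are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Embedding = Sequence[float]


def mean_vector(vectors: List[Embedding]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(
    support: List[Tuple[Embedding, Hashable]]
) -> Dict[Hashable, List[float]]:
    """Average the support embeddings of each class into one prototype."""
    by_label = defaultdict(list)
    for embedding, label in support:
        by_label[label].append(embedding)
    return {label: mean_vector(embs) for label, embs in by_label.items()}


def cosine(a: Embedding, b: Embedding) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(query: Embedding, prototypes: Dict[Hashable, List[float]]) -> Hashable:
    """Return the label whose prototype is most similar to the query."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


if __name__ == "__main__":
    # Hypothetical 2-shot, 2-class episode with toy 2-D embeddings.
    support = [
        ([1.0, 0.0], "species_a"),
        ([0.9, 0.1], "species_a"),
        ([0.0, 1.0], "species_b"),
        ([0.1, 0.9], "species_b"),
    ]
    prototypes = build_prototypes(support)
    print(classify([0.8, 0.2], prototypes))
```

In practice the embeddings would come from the SSL-pretrained encoder applied to audio from species not seen during pretraining; only the class means are computed from the few labeled examples.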

Abstract

Self-supervised learning (SSL) in audio holds significant potential across various domains, particularly in situations where abundant, unlabeled data is readily available at no cost. This is pertinent in bioacoustics, where biologists routinely collect extensive sound datasets from the natural environment. In this study, we demonstrate that SSL is capable of acquiring meaningful representations of bird sounds from audio recordings without the need for annotations. Our experiments showcase that these learned representations exhibit the capacity to generalize to new bird species in few-shot learning (FSL) scenarios. Additionally, we show that selecting windows with high bird activation for self-supervised learning, using a pretrained audio neural network, significantly enhances the quality of the learned representations.
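The window-selection idea in the abstract can be sketched as follows: slide a fixed-length window over each recording, score every window, and keep the highest-scoring ones for self-supervised training. This is a minimal sketch under stated assumptions, not the authors' implementation. The scorer below (`mean_energy`) is a simple stand-in; the paper scores windows with a pretrained audio neural network, whose bird-activation output would replace it. The function names and parameters are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def frame_starts(n_samples: int, window_len: int, hop: int) -> List[int]:
    """Start indices of every full window that fits in the recording."""
    if window_len <= 0 or hop <= 0:
        raise ValueError("window_len and hop must be positive")
    if n_samples < window_len:
        return []
    return list(range(0, n_samples - window_len + 1, hop))


def mean_energy(window: Sequence[float]) -> float:
    """Stand-in activation score: mean squared amplitude.

    Assumption: a real pipeline would return the bird activation
    predicted by a pretrained audio network instead.
    """
    return sum(x * x for x in window) / len(window)


def select_top_windows(
    signal: Sequence[float],
    window_len: int,
    hop: int,
    k: int = 1,
    score_fn: Callable[[Sequence[float]], float] = mean_energy,
) -> List[Tuple[int, float]]:
    """Return up to k (start_index, score) pairs, highest score first."""
    scored = [
        (start, score_fn(signal[start:start + window_len]))
        for start in frame_starts(len(signal), window_len, hop)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy recording: silence, a short burst standing in for a call, silence.
    recording = [0.0] * 8 + [0.9, -0.8, 0.7, -0.9] + [0.0] * 8
    for start, score in select_top_windows(recording, window_len=4, hop=2, k=2):
        print(start, round(score, 4))
```

Passing the scorer as `score_fn` keeps the selection logic independent of the model, so a network-based activation score can be substituted without touching the windowing code.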

Code & Models

Repositories

ilyassmoummad/ssl4birdsounds
PyTorch · Official

Taxonomy

Topics: Animal Vocal Communication and Behavior · Marine animal studies overview