AudioProtoPNet: An interpretable deep learning model for bird sound classification

René Heinrich; Lukas Rauch; Bernhard Sick; Christoph Scholz

arXiv:2404.10420 · cs.LG · November 14, 2024 · 2 cites

Open Access

TL;DR

AudioProtoPNet is an interpretable deep learning model for bird sound classification that uses prototype learning to provide explanations and insights, outperforming previous models on multiple datasets.

Contribution

This paper introduces AudioProtoPNet, a novel interpretable deep learning approach for multi-label bird sound classification using prototype learning.

Findings

01 Outperforms the state-of-the-art model Perch with 7.1% higher AUROC

02 Achieves 16.7% higher cmAP than Perch

03 Provides explanations for model decisions and insights into bird vocalizations
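The two metrics named in the findings can be illustrated with a small sketch. It assumes that AUROC is averaged over classes and that cmAP denotes the class-wise mean of average precision; the helper names (`auroc`, `average_precision`, `class_mean`) and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def auroc(labels, scores):
    """AUROC for one class: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision for one class: mean precision at each positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def class_mean(metric, label_matrix, score_matrix):
    """Apply a per-class metric to every column and average the results."""
    n_classes = len(label_matrix[0])
    values = []
    for c in range(n_classes):
        y = [row[c] for row in label_matrix]
        s = [row[c] for row in score_matrix]
        values.append(metric(y, s))
    return sum(values) / len(values)


# Toy multi-label data: 4 recordings, 2 species (hypothetical values).
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.4, 0.8], [0.3, 0.7], [0.1, 0.6]]

cmap = class_mean(average_precision, labels, scores)  # 11/12, about 0.917
mean_auroc = class_mean(auroc, labels, scores)        # 0.875
```

Averaging per class gives rare species the same weight as common ones, which is why class-wise means are a common choice for multi-label benchmarks with imbalanced classes.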

Abstract

Deep learning models have significantly advanced acoustic bird monitoring by being able to recognize numerous bird species based on their vocalizations. However, traditional deep learning models are black boxes that provide no insight into their underlying computations, limiting their usefulness to ornithologists and machine learning engineers. Explainable models could facilitate debugging, knowledge discovery, trust, and interdisciplinary collaboration. This study introduces AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification. It is an inherently interpretable model that uses a ConvNeXt backbone to extract embeddings, with the classification layer replaced by a prototype learning classifier trained on these embeddings. The classifier learns prototypical patterns of each bird species' vocalizations from spectrograms of…
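The architecture described in the abstract, a backbone producing embeddings followed by a prototype learning classifier, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the use of cosine similarity, max pooling over spectrogram regions, and a per-class sigmoid for the multi-label output are assumptions, the function names are hypothetical, and the backbone is replaced by hand-written toy embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def prototype_head(patches, prototypes, weights, bias):
    """
    patches:    embedding vectors, one per spectrogram region
                (stand-ins for backbone outputs; assumed shape)
    prototypes: learned prototype vectors
    weights:    weights[c][p] connects prototype p to class c
    bias:       one bias per class
    Returns per-class probabilities, the best similarity per prototype,
    and the index of the best-matching region per prototype.
    """
    sims, best = [], []
    for proto in prototypes:
        scores = [cosine(patch, proto) for patch in patches]
        k = max(range(len(scores)), key=scores.__getitem__)
        sims.append(scores[k])  # max pooling over regions
        best.append(k)          # region that "explains" the match
    probs = []
    for w_c, b_c in zip(weights, bias):
        logit = sum(w * s for w, s in zip(w_c, sims)) + b_c
        probs.append(1.0 / (1.0 + math.exp(-logit)))  # independent sigmoid per class
    return probs, sims, best


# Toy example: 3 regions, 2 prototypes, 2 species (all values hypothetical).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
weights = [[4.0, 0.0], [0.0, 4.0]]
bias = [-2.0, -2.0]

probs, sims, best = prototype_head(patches, prototypes, weights, bias)
```

The `best` indices are what make such a head interpretable: each class score can be traced back to the spectrogram region that most resembles a learned prototype.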


Taxonomy

Topics: Animal Vocal Communication and Behavior · Music and Audio Processing · Diverse Musicological Studies

Methods: ConvNeXt