Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring
Serge Zaugg, Mike van der Schaar, Florence Erbs, Antonio Sanchez, Joan V. Castell, Emiliano Ramallo, Michel André

TL;DR
This paper presents the design of small, fast, and accurate CNNs with a novel frequency unwrapping layer for efficient acoustic biodiversity monitoring. The models achieve high classification performance from moderate-sized training data and are suitable for low-cost hardware.
Contribution
The paper introduces SIMP-FU models, which combine a frequency unwrapping layer with an optimized receptive field to improve classification of animal sounds from moderate-sized training data.
Findings
Models with an RF duration of 1.5 s perform best.
Achieved over 0.95 AUC in 18 of 20 classes.
Models run up to seven times faster than real-time on low-cost hardware.
Abstract
Automated classification of animal sounds is a prerequisite for large-scale monitoring of biodiversity. Convolutional Neural Networks (CNNs) are among the most promising algorithms, but they are slow, often achieve poor classification in the field, and typically require large training data sets. Our objective was to design CNNs that are fast at inference time and achieve good classification performance while learning from moderate-sized data. Recordings from a rainforest ecosystem were used. The start and end points of sounds from 20 bird species were manually annotated. Spectrograms from 10-second segments were used as CNN input. We designed simple CNNs with a frequency unwrapping layer (SIMP-FU models) such that any output unit was connected to all spectrogram frequencies but only to a sub-region of time, the Receptive Field (RF). Our models allowed experimentation with different RF…
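The abstract ties each output unit to a limited span of time, the RF. As a rough illustration of how an RF duration follows from an architecture, the sketch below applies the standard receptive-field recurrence for stacked 1-D convolution and pooling layers along the time axis. The layer stack, hop size, and window length are hypothetical values chosen for the example, not the configuration used in the paper, and the function names are invented here.

```python
def receptive_field_frames(layers):
    """Receptive field, in input time frames, of stacked 1-D conv/pool layers.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1    # with no layers, an output unit sees exactly one input frame
    jump = 1  # distance in input frames between adjacent units at this depth
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


def receptive_field_seconds(layers, hop_seconds, window_seconds):
    """Convert the RF from spectrogram frames to seconds of audio.

    Consecutive frames start hop_seconds apart and each frame covers
    window_seconds of audio.
    """
    frames = receptive_field_frames(layers)
    return (frames - 1) * hop_seconds + window_seconds


# Hypothetical time-axis stack: conv(3), pool(2), conv(3), pool(2), conv(3).
example_layers = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]

# Hypothetical spectrogram settings: 10 ms hop, 20 ms analysis window.
print(receptive_field_frames(example_layers))
print(receptive_field_seconds(example_layers, hop_seconds=0.01, window_seconds=0.02))
```

With these illustrative settings the stack covers 18 spectrogram frames, or about 0.19 s of audio. Reaching a longer RF under this recurrence means adding layers, enlarging kernels, or increasing strides along the time axis.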
Taxonomy
Topics: Animal Vocal Communication and Behavior · Marine animal studies overview · Music and Audio Processing
