Densely Connected CNNs for Bird Audio Detection
Thomas Pellegrini

TL;DR
This paper evaluates various convolutional neural network architectures for bird audio detection, demonstrating that densely connected networks achieve the best performance in a competitive challenge setting.
Contribution
It demonstrates the effectiveness of DenseNets for bird sound detection and compares them with standard CNNs and residual networks within the Bird Audio Detection challenge.
Findings
DenseNets achieved an 88.22% AUC score on the challenge test set.
Data augmentation and ensemble methods improved performance.
Enlarging training data with pseudo-labels degraded results.
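For readers unfamiliar with the metric, the AUC reported in the findings is the area under the ROC curve, which equals the probability that a randomly chosen positive clip (bird present) is scored higher than a randomly chosen negative clip. The sketch below computes it by pairwise comparison and also shows one simple way to ensemble models, by averaging their scores. The function names `roc_auc` and `mean_ensemble` are hypothetical, and score averaging is an assumption: the summary does not say which ensembling scheme the paper used.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = bird present); scores: detector outputs.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative clip")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_ensemble(score_lists):
    """Average per-clip scores across models (assumed ensembling scheme)."""
    n_models = len(score_lists)
    return [sum(clip_scores) / n_models for clip_scores in zip(*score_lists)]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    model_a = [0.1, 0.4, 0.35, 0.8]
    model_b = [0.2, 0.3, 0.6, 0.7]
    print(roc_auc(labels, model_a))
    print(roc_auc(labels, mean_ensemble([model_a, model_b])))
```

The pairwise form is quadratic in the number of clips, so it is suited to illustration rather than to a dataset of challenge size, where a rank-based computation would be the usual choice.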
Abstract
Detecting bird sounds in audio recordings automatically, if accurate enough, is expected to be of great help to the research community working in bio- and ecoacoustics, interested in monitoring biodiversity based on audio field recordings. To estimate how accurate the state-of-the-art machine learning approaches are, the Bird Audio Detection challenge involving large audio datasets was recently organized. In this paper, experiments using several types of convolutional neural networks (i.e. standard CNNs, residual nets and densely connected nets) are reported in the framework of this challenge. DenseNets were the preferred solution since they were the best performing and most compact models, leading to an 88.22% area under the receiver operating characteristic curve score on the test set of the challenge. Performance gains were obtained thanks to data augmentation through time and frequency shifting,…
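The abstract mentions data augmentation through time and frequency shifting. As a rough illustration only, the sketch below shifts a spectrogram stored as a list of frequency rows, each a list of time frames. The abstract does not give details, so everything specific here is an assumption: the circular wrap in time, the zero padding in frequency, the shift ranges, and the hypothetical names `shift_time`, `shift_frequency` and `augment`.

```python
import random


def shift_time(spec, frames):
    """Circularly shift every frequency row by `frames` time steps.

    Wrapping around is an assumption; padding would be another option.
    """
    n_frames = len(spec[0])
    k = frames % n_frames
    if k == 0:
        return [list(row) for row in spec]
    return [row[-k:] + row[:-k] for row in spec]


def shift_frequency(spec, bins, fill=0.0):
    """Move rows by `bins` along the frequency axis, padding with `fill`.

    Rows shifted outside the spectrogram are dropped (no wrap-around).
    """
    n_bins, n_frames = len(spec), len(spec[0])
    out = [[fill] * n_frames for _ in range(n_bins)]
    for i, row in enumerate(spec):
        j = i + bins
        if 0 <= j < n_bins:
            out[j] = list(row)
    return out


def augment(spec, rng, max_time_shift, max_freq_shift):
    """Apply a random time shift, then a random frequency shift."""
    t = rng.randint(-max_time_shift, max_time_shift)
    f = rng.randint(-max_freq_shift, max_freq_shift)
    return shift_frequency(shift_time(spec, t), f)


if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 frequency bins x 3 frames
    print(shift_time(toy, 1))
    print(shift_frequency(toy, 1))
    print(augment(toy, random.Random(0), max_time_shift=2, max_freq_shift=1))
```

Both shifts preserve the spectrogram's shape, so augmented examples can be fed to the same network as the originals, and the clip-level label (bird present or absent) is unchanged.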
