Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning
Dan Stowell, Mark D. Plumbley

TL;DR
This paper demonstrates that unsupervised feature learning significantly enhances large-scale bird sound classification accuracy, outperforming traditional spectral features like MFCCs, with learned features resembling avian auditory receptive fields.
Contribution
Introduces a novel unsupervised feature learning technique for bird sound classification that improves performance over traditional spectral features at large data scales.
Findings
Unsupervised features outperform MFCCs and raw Mel spectra.
The boost in classification accuracy is largest for large-scale single-label tasks.
Learned features resemble avian auditory receptive fields.
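For context on the baseline named in these findings: MFCCs are conventionally obtained by taking a discrete cosine transform (DCT) of the log Mel spectrum and keeping the first few coefficients. The sketch below illustrates that relationship only. The function name `mfcc_from_mel`, the unnormalised DCT-II, and the toy input values are assumptions made for illustration, not details taken from the paper.

```python
import math

def mfcc_from_mel(mel_energies, n_coeffs):
    """Return the first n_coeffs unnormalised DCT-II coefficients of the log Mel energies."""
    # Assumes every Mel band energy is strictly positive, so the log is defined.
    log_mel = [math.log(e) for e in mel_energies]
    n = len(log_mel)
    return [
        sum(x * math.cos(math.pi * k * (i + 0.5) / n) for i, x in enumerate(log_mel))
        for k in range(n_coeffs)
    ]

# Toy, made-up Mel frame with 8 equal band energies (a perfectly flat spectrum).
flat = mfcc_from_mel([2.0] * 8, n_coeffs=4)
```

For a flat spectrum only the zeroth coefficient is non-zero, which shows how the transform compresses the overall spectral shape into a few numbers. That compression is fixed in advance rather than learned from data.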
Abstract
Automatic species classification of birds from their sound is a computational tool of increasing importance in ecology, conservation monitoring and vocal communication studies. To make classification useful in practice, it is crucial to improve its accuracy while ensuring that it can run at big data scales. Many approaches use acoustic measures based on spectrogram-type data, such as the Mel-frequency cepstral coefficient (MFCC) features which represent a manually-designed summary of spectral information. However, recent work in machine learning has demonstrated that features learnt automatically from data can often outperform manually-designed feature transforms. Feature learning can be performed at large scale and "unsupervised", meaning it requires no manual data labelling, yet it can improve performance on "supervised" tasks such as classification. In this work we introduce a…
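The general idea described above (learn features from unlabelled audio, then reuse them for a supervised task) can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation. It assumes spherical k-means as a stand-in unsupervised learner, tiny made-up 3-dimensional vectors in place of real spectrogram frames, and hypothetical helper names (`normalize`, `dot`, `learn_bases`, `encode`).

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def learn_bases(frames, k, n_iter=20, seed=0):
    """Spherical k-means: learn k unit-length bases from unlabelled frames."""
    rng = random.Random(seed)
    data = [normalize(f) for f in frames]
    bases = [list(v) for v in rng.sample(data, k)]
    for _ in range(n_iter):
        # Assign each frame to the basis with the highest cosine similarity.
        sums = [[0.0] * len(data[0]) for _ in range(k)]
        for v in data:
            j = max(range(k), key=lambda i: dot(bases[i], v))
            sums[j] = [s + x for s, x in zip(sums[j], v)]
        # Re-estimate each basis as the normalised sum of its members;
        # keep the previous basis if the sum is all zeros (e.g. an empty cluster).
        bases = [normalize(s) if any(s) else b for s, b in zip(sums, bases)]
    return bases

def encode(frame, bases):
    """Project one frame onto the learned bases to get its feature vector."""
    v = normalize(frame)
    return [dot(b, v) for b in bases]

# Toy "spectral frames" (made-up values); no labels are used during learning.
unlabelled = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.0, 0.1, 1.0], [0.1, 0.0, 0.8]]
bases = learn_bases(unlabelled, k=2)
features = [encode(f, bases) for f in unlabelled]
```

In a full pipeline, the `features` vectors would be paired with species labels and passed to a supervised classifier. Only that final step needs labelled data.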
