Data augmentation approaches for improving animal audio classification

Loris Nanni; Gianluca Maguolo; Michelangelo Paci

arXiv:1912.07756·cs.LG·March 17, 2020

TL;DR

This study explores various data augmentation techniques combined with CNN ensembles to improve animal audio classification accuracy, demonstrating superior results over existing methods on bird and cat sound datasets.

Contribution

The paper presents a comprehensive comparison of data augmentation protocols for CNN-based animal audio classification and reports the best recognition rates on the tested datasets without ad hoc parameter optimization.

Findings

01 Ensemble classifiers outperform individual CNNs.

02 Data augmentation significantly improves recognition rates.

03 Fusion of multiple CNNs yields better results than single models.
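
As a rough illustration of what fusing several CNNs can look like, the sketch below averages per-class probabilities across models (the sum rule). The summary above does not say which fusion rule the authors use, so the averaging rule, the function names `softmax` and `fuse_by_average`, and all logit values here are assumptions made for illustration only.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores of shape (n_samples, n_classes) to probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def fuse_by_average(prob_list):
    """Average class probabilities over models (sum rule, an assumed choice)."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0)


# Hypothetical logits from three models for 2 audio clips and 3 classes.
logits_a = np.array([[2.0, 1.0, 0.1], [0.2, 0.1, 0.3]])
logits_b = np.array([[1.5, 0.5, 0.2], [0.0, 3.0, 0.0]])
logits_c = np.array([[3.0, 0.1, 0.1], [0.1, 2.0, 0.5]])

fused = fuse_by_average([softmax(l) for l in (logits_a, logits_b, logits_c)])
predictions = fused.argmax(axis=1)  # one class index per clip
```

In this toy case the first model alone would pick a different class for the second clip than the fused ensemble does, which is the kind of disagreement that averaging is meant to smooth out.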

Abstract

In this paper, we present ensembles of classifiers for automated animal audio classification, exploiting different data augmentation techniques for training Convolutional Neural Networks (CNNs). The specific animal audio classification problems are i) bird and ii) cat sounds, whose datasets are freely available. We train five different CNNs on the original datasets and on their versions augmented by four augmentation protocols, working on the raw audio signals or their representations as spectrograms. We compare our best approaches with the state of the art, showing that we obtain the best recognition rate on the same datasets, without ad hoc parameter optimization. Our study shows that different CNNs can be trained for the purpose of animal audio classification and that their fusion works better than the stand-alone classifiers. To the best of our knowledge this is the largest study…
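
The abstract distinguishes augmentation applied to the raw audio signal from augmentation applied to the spectrogram. A minimal sketch of each kind follows. The specific transforms (additive Gaussian noise, circular time shift, time and frequency masking) are common generic choices and are not taken from the paper's four protocols, and every function name, parameter, and value below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def add_noise(signal, snr_db, rng):
    """Raw-signal augmentation: add white Gaussian noise at a target SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def time_shift(signal, max_shift, rng):
    """Raw-signal augmentation: circularly shift by a random number of samples."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(signal, shift)


def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Spectrogram augmentation: zero one random frequency band and one time band."""
    masked = spec.copy()
    n_freq, n_time = masked.shape
    f_width = int(rng.integers(0, max_freq_width + 1))
    f_start = int(rng.integers(0, n_freq - f_width + 1))
    masked[f_start:f_start + f_width, :] = 0.0
    t_width = int(rng.integers(0, max_time_width + 1))
    t_start = int(rng.integers(0, n_time - t_width + 1))
    masked[:, t_start:t_start + t_width] = 0.0
    return masked


# Toy input: a 1-second 440 Hz tone sampled at 8 kHz (placeholder for a real clip).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 440 * t)

noisy = add_noise(clip, snr_db=20, rng=rng)
shifted = time_shift(clip, max_shift=800, rng=rng)

# Placeholder spectrogram (64 frequency bins x 100 frames) of random magnitudes.
spec = np.abs(rng.normal(size=(64, 100)))
masked = mask_spectrogram(spec, max_freq_width=8, max_time_width=10, rng=rng)
```

Each augmented copy keeps the label of the clip it came from, so a training set can be enlarged by applying such transforms several times with different random draws.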
