Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
Christiaan M. Geldenhuys, Thomas R. Niesler

TL;DR
This paper applies an audio spectrogram transformer (AST), configured in a sequence-to-sequence manner, to the frame-level detection, classification, and endpointing of elephant calls in continuously recorded audio, with the aim of assisting conservation efforts.
Contribution
It presents the first application of audio spectrogram transformers, with transfer learning, to elephant call detection and sub-call classification, and improves on the current best reported performance.
Findings
Achieved an average precision of 0.962 for call detection.
Attained AUC scores of 0.957 and 0.979 for call and sub-call classification, respectively.
Demonstrated that transformer models outperform previous shallow and deep classifiers.
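For readers unfamiliar with the detection metric above, the following is a minimal sketch of one common, non-interpolated definition of average precision, computed over per-frame scores. The summary does not state the paper's exact evaluation procedure, so the function name `average_precision` and the toy labels and scores are illustrative assumptions, not the authors' code.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Non-interpolated average precision for binary labels (1 = call, 0 = no call).

    Assumption: this is one standard definition, not necessarily the exact
    procedure used in the paper. Tied scores keep their input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")

    # Rank frames from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision at the rank where this positive frame is retrieved.
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: four frames, two of which contain a call.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)
```

A score of 1.0 means every call frame is ranked above every non-call frame; a false alarm ranked above a true call frame lowers the value.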
Abstract
We consider the problem of detecting, isolating and classifying elephant calls in continuously recorded audio. Such automatic call characterisation can assist conservation efforts and inform environmental management strategies. In contrast to previous work in which call detection was performed at a segment level, we perform call detection at a frame level which implicitly also allows call endpointing, the isolation of a call in a longer recording. For experimentation, we employ two annotated datasets, one containing Asian and the other African elephant vocalisations. We evaluate several shallow and deep classifier models, and show that the current best performance can be improved by using an audio spectrogram transformer (AST), a neural architecture which has not been used for this purpose before, and which we have configured in a novel sequence-to-sequence manner. We also show that…
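The abstract notes that frame-level detection implicitly yields call endpointing. A minimal sketch of that idea follows, assuming per-frame call probabilities are already available from some classifier. The function name `frames_to_segments`, the 0.5 threshold, and the hop duration are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def frames_to_segments(
    frame_probs: Sequence[float],
    threshold: float = 0.5,    # assumed decision threshold
    hop_seconds: float = 0.5,  # assumed time step between frames
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive above-threshold frames into (start, end) times in seconds."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    for i, p in enumerate(frame_probs):
        active = p >= threshold
        if active and start is None:
            start = i  # a call begins at this frame
        elif not active and start is not None:
            segments.append((start * hop_seconds, i * hop_seconds))
            start = None  # the call ended just before this frame
    if start is not None:
        # The recording ends while a call is still active.
        segments.append((start * hop_seconds, len(frame_probs) * hop_seconds))
    return segments


# Hypothetical per-frame probabilities for a short recording.
example_probs = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
example_segments = frames_to_segments(example_probs)
```

Each returned pair marks the start and end of one detected call, which a segment-level detector cannot recover directly. In practice, very short runs or gaps might be smoothed before merging; that step is omitted here.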
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
