Automatic Bat Call Classification using Transformer Networks
Frank Fundel, Daniel A. Braun, Sebastian Gottwald

TL;DR
This paper introduces a Transformer-based model for automatic multi-label bat call classification with potential for real-time use. The model is trained on synthetic multi-species recordings and outperforms existing tools on public datasets.
Contribution
The paper presents a novel Transformer architecture for multi-label bat call classification, trained on synthetic data, with improved accuracy and real-time potential.
Findings
Achieved a single-species accuracy of 88.92% (F1-score of 84.23%) and a macro F1-score of 74.40%.
Outperformed existing tools by at least 25.82% in accuracy.
Demonstrated effectiveness on public datasets like ChiroVox.
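The macro F1-score listed among the findings weights every species equally, regardless of how often it occurs. As a minimal sketch of how such a score is commonly computed for multi-label predictions (the function name `macro_f1` and the handling of classes with no positives are assumptions for illustration, not details taken from the paper):

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 for multi-hot label matrices of shape (samples, classes).

    Assumption: a class with no true and no predicted positives scores 0.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred, axis=0)    # true positives per class
    fp = np.sum(~y_true & y_pred, axis=0)   # false positives per class
    fn = np.sum(y_true & ~y_pred, axis=0)   # false negatives per class
    denom = 2 * tp + fp + fn
    # Per-class F1 = 2TP / (2TP + FP + FN); guard against division by zero.
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(np.mean(f1))               # unweighted mean over classes

# Three recordings, two hypothetical species.
y_true = [[1, 0], [0, 1], [1, 1]]
y_pred = [[1, 0], [0, 0], [1, 1]]
score = macro_f1(y_true, y_pred)
```

Because the mean is unweighted, a rarely recorded species that is classified poorly lowers the score as much as a common one would.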
Abstract
Automatically identifying bat species from their echolocation calls is a difficult but important task for monitoring bats and the ecosystem they live in. Major challenges in automatic bat call identification are high call variability, similarities between species, interfering calls, and a lack of annotated data. Many currently available models perform relatively poorly on real-life data because they are trained on single-call datasets and, moreover, are often too slow for real-time classification. Here, we propose a Transformer architecture for multi-label classification with potential applications in real-time classification scenarios. We train our model on synthetically generated multi-species recordings by merging multiple bat calls into a single recording with multiple simultaneous calls. Our approach achieves a single species accuracy of 88.92% (F1-score of 84.23%) and a…
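The synthetic-data idea described in the abstract, merging several single-species calls into one recording with a multi-label target, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name `mix_calls`, the random time offsets, and the peak normalization are all hypothetical choices.

```python
import numpy as np

def mix_calls(calls, labels, num_species, length, rng):
    """Overlay single-species clips into one waveform with a multi-hot target.

    Hypothetical helper: offsets and normalization are illustrative assumptions.
    """
    mix = np.zeros(length, dtype=np.float32)
    target = np.zeros(num_species, dtype=np.float32)
    for clip, label in zip(calls, labels):
        clip = np.asarray(clip, dtype=np.float32)[:length]  # truncate long clips
        # Place the clip at a random position so calls can overlap in time.
        offset = int(rng.integers(0, length - len(clip) + 1))
        mix[offset:offset + len(clip)] += clip
        target[label] = 1.0                                  # mark species present
    peak = np.max(np.abs(mix))
    if peak > 0:
        mix /= peak                                          # avoid clipping
    return mix, target

rng = np.random.default_rng(0)
# Stand-in signals; real inputs would be recorded single-species calls.
calls = [np.ones(10), 0.5 * np.ones(20)]
mix, target = mix_calls(calls, labels=[2, 0], num_species=4, length=100, rng=rng)
```

The multi-hot `target` is what makes the task multi-label: each output unit answers independently whether its species is present in the recording.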
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Adam · Linear Layer · Dropout
