TL;DR
This paper shows that lightweight EfficientNet models, combined with parameter-efficient fine-tuning (PEFT) and data augmentation, can classify UAV audio with high accuracy from limited data, outperforming the other architectures evaluated.
Contribution
The paper integrates pre-trained EfficientNet models with PEFT strategies and targeted data augmentation for UAV audio classification, addressing the scarcity of UAV audio data.
Findings
Fully fine-tuned EfficientNet-B0 with three augmentations achieved 95.95% accuracy.
Full fine-tuning outperformed partial fine-tuning, and EfficientNet-B0 outperformed the other evaluated models.
Lightweight architectures with PEFT are effective for limited UAV audio datasets.
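The findings refer to three augmentations without naming them in this summary. As a minimal sketch of what waveform-level augmentation can look like, the example below applies a circular time shift, additive Gaussian noise, and a gain change. These three transforms, the function names, and the 8 kHz sample rate are assumptions chosen for illustration; they are not confirmed to be the augmentations used in the paper.

```python
import math
import random


def time_shift(wave, shift):
    """Circularly shift the waveform to the right by `shift` samples."""
    shift %= len(wave)
    if shift == 0:
        return list(wave)
    return wave[-shift:] + wave[:-shift]


def add_noise(wave, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [x + rng.gauss(0.0, sigma) for x in wave]


def scale_gain(wave, gain_db):
    """Scale amplitude by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in wave]


def augment(wave, rng):
    """Apply the three hypothetical augmentations with random parameters."""
    shifted = time_shift(wave, rng.randrange(len(wave)))
    noisy = add_noise(shifted, sigma=0.01, rng=rng)
    return scale_gain(noisy, gain_db=rng.uniform(-6.0, 6.0))


# Assumed example input: a 1-second, 200 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

rng = random.Random(0)
augmented = augment(tone, rng)
print(len(tone), len(augmented))
```

Each transform preserves the clip length, so augmented clips can be fed to the same feature pipeline as the originals.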
Abstract
As unmanned aerial vehicles (UAVs) become increasingly prevalent in both consumer and defense applications, the need for reliable, modality-specific classification systems grows in urgency. This paper addresses the challenge of data scarcity in UAV audio classification by expanding on prior work through the integration of pre-trained deep learning models, parameter-efficient fine-tuning (PEFT) strategies, and targeted data augmentation techniques. Using a custom dataset of 3,100 UAV audio clips (15,500 seconds) spanning 31 distinct drone types, we evaluate the performance of transformer-based and convolutional neural network (CNN) architectures under various fine-tuning configurations. Experiments were conducted with five-fold cross-validation, assessing accuracy, training efficiency, and robustness. Results show that full fine-tuning of the EfficientNet-B0 model with three…
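The dataset figures in the abstract imply an average clip length of 15,500 / 3,100 = 5 seconds. The five-fold cross-validation protocol can be sketched as below using only the standard library. The per-class stratification, the helper name `stratified_folds`, and the balanced layout of 100 clips per drone type are assumptions; the abstract states only the totals (3,100 clips, 31 types) and does not say how folds were drawn.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal shuffled indices round-robin so each fold gets about 1/k of the class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Assumption: 100 clips for each of the 31 drone types (3,100 clips in total).
labels = [drone_type for drone_type in range(31) for _ in range(100)]
folds = stratified_folds(labels, k=5)

avg_clip_seconds = 15500 / 3100  # average clip length implied by the abstract

for held_out in range(5):
    val_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A real run would fine-tune on train_idx and evaluate on val_idx here.
    print(held_out, len(train_idx), len(val_idx))
```

Under the balanced-class assumption, each fold holds 620 clips (20 per drone type), leaving 2,480 clips for training in every round.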
