BioME: A Resource-Efficient Bioacoustic Foundational Model for IoT Applications
Heitor R. Guimarães, Abhishek Tiwari, Mahsa Abdollahi, Anderson R. Avila, Tiago H. Falk

TL;DR
BioME is a lightweight bioacoustic encoder trained through distillation and multi-domain pretraining. It performs strongly on biodiversity monitoring tasks while remaining small enough to run on IoT devices.
Contribution
The paper introduces BioME, a bioacoustic encoder that uses layer-to-layer distillation to cut the parameter count by 75% relative to its teacher model and incorporates DSP-inspired features to improve ecological generalization.
Findings
BioME matches or exceeds larger models in bioacoustic tasks.
It reduces parameter count by 75% compared to its teacher model.
BioME is suitable for deployment on resource-constrained IoT platforms.
Abstract
Passive acoustic monitoring has become a key strategy in biodiversity assessment, conservation, and behavioral ecology, especially as Internet-of-Things (IoT) devices enable continuous in situ audio collection at scale. While recent self-supervised learning (SSL)-based audio encoders, such as BEATs and AVES, have shown strong performance in bioacoustic tasks, their computational cost and limited robustness to unseen environments hinder deployment on resource-constrained platforms. In this work, we introduce BioME, a resource-efficient audio encoder designed for bioacoustic applications. BioME is trained via layer-to-layer distillation from a high-capacity teacher model, enabling strong representational transfer while reducing the parameter count by 75%. To further improve ecological generalization, the model is pretrained on multi-domain data spanning speech, environmental sounds, and…
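As a rough illustration of the layer-to-layer distillation idea mentioned in the abstract, the sketch below pairs each layer of a shallower student with one teacher layer and averages a per-layer mean-squared error. The abstract does not state the layer mapping, the loss function, or the hidden sizes, so everything here is an assumption: the uniform mapping in `map_layers`, the MSE objective, the equal student and teacher hidden dimensions, and the function names themselves are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def map_layers(num_student: int, num_teacher: int) -> List[int]:
    """Assumed uniform mapping: student layer i is paired with a teacher layer.

    Hypothetical choice; the paper's actual mapping is not given in the abstract.
    """
    if num_student <= 0 or num_teacher < num_student:
        raise ValueError("need 0 < num_student <= num_teacher")
    # Integer arithmetic spreads the student layers evenly over the teacher,
    # always ending on the teacher's last layer.
    return [((i + 1) * num_teacher) // num_student - 1 for i in range(num_student)]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean-squared error between two equally sized hidden-state vectors."""
    if len(a) != len(b):
        raise ValueError("hidden sizes must match in this sketch")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def layer_to_layer_loss(
    student_states: Sequence[Sequence[float]],
    teacher_states: Sequence[Sequence[float]],
) -> float:
    """Average the per-layer MSE over all (student, mapped teacher) layer pairs."""
    mapping = map_layers(len(student_states), len(teacher_states))
    per_layer = [
        mse(s, teacher_states[t]) for s, t in zip(student_states, mapping)
    ]
    return sum(per_layer) / len(per_layer)


# Toy example: a 4-layer teacher and a 2-layer student, hidden size 2.
teacher = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
student = [[1.0, 1.0], [3.0, 5.0]]
loss = layer_to_layer_loss(student, teacher)
```

In this toy case the student's two layers are paired with teacher layers 1 and 3 (zero-based). In a real setup the hidden states would be framewise tensors, and a learned projection would typically be needed if the student's hidden size differed from the teacher's.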
Taxonomy
Topics: Animal Vocal Communication and Behavior · Music and Audio Processing · Speech and Audio Processing
