Exploring bat song syllable representations in self-supervised audio encoders

Marianne de Heer Kloots; Mirjam Knörnschild

arXiv:2409.12634·cs.SD·September 20, 2024·2 cites

TL;DR

This study investigates how several self-supervised audio encoders represent bat song syllables. It finds that models pre-trained on human speech produce the most distinctive representations of different syllable types, a first step towards cross-species transfer learning in bat bioacoustics.

Contribution

The paper shows that audio encoders pre-trained on human speech can be useful for analyzing bat vocalizations, a step towards cross-species transfer learning in bioacoustics.

Findings

01

Among the encoders compared, models pre-trained on human speech produce the most distinctive representations of different bat song syllable types.

02

The results are a first step towards applying cross-species transfer learning in bat bioacoustics.

03

The analysis also improves understanding of how audio encoder models process out-of-distribution signals.

Abstract

How well can deep learning models trained on human-generated sounds distinguish between another species' vocalization types? We analyze the encoding of bat song syllables in several self-supervised audio encoders, and find that models pre-trained on human speech generate the most distinctive representations of different syllable types. These findings form first steps towards the application of cross-species transfer learning in bat bioacoustics, as well as an improved understanding of out-of-distribution signal processing in audio encoder models.
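To make "distinctive representations of different syllable types" concrete, here is a minimal sketch of one way such distinctiveness could be scored. It is an illustration only, not the measure used in the paper, which this page does not describe. The function name `distinctiveness`, the toy two-dimensional vectors, and the syllable labels "A" and "B" are all hypothetical; in practice the vectors would be embeddings extracted from an audio encoder.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distinctiveness(embeddings, labels):
    """Hypothetical score: mean distance between syllable-type centroids
    divided by mean distance of each embedding to its own type centroid.
    Higher values mean the types are better separated.
    Assumes at least two types and non-zero within-type spread."""
    by_type = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        by_type[lab].append(vec)
    centroids = {lab: centroid(vecs) for lab, vecs in by_type.items()}

    # Spread of each syllable type around its own centroid.
    within = [
        euclidean(vec, centroids[lab])
        for lab, vecs in by_type.items()
        for vec in vecs
    ]
    # Distances between every pair of distinct type centroids.
    labs = sorted(centroids)
    between = [
        euclidean(centroids[a], centroids[b])
        for i, a in enumerate(labs)
        for b in labs[i + 1:]
    ]
    return (sum(between) / len(between)) / (sum(within) / len(within))


# Toy embeddings (made up for illustration): two syllable types, two items each.
toy_labels = ["A", "A", "B", "B"]
separated = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
overlapping = [[0.0, 0.0], [0.0, 2.0], [1.0, 0.0], [1.0, 2.0]]

print(distinctiveness(separated, toy_labels))    # well-separated types
print(distinctiveness(overlapping, toy_labels))  # types close together
```

Under this toy score, an encoder whose embeddings resemble `separated` would count as representing the two syllable types more distinctively than one whose embeddings resemble `overlapping`.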

Taxonomy

Topics: Animal Vocal Communication and Behavior · Bat Biology and Ecology Studies · Marine animal studies overview