Multi-layer attentive probing improves transfer of audio representations for bioacoustics

Marius Miron; David Robinson; Masato Hagiwara; Titouan Parcollet; Jules Cauzinille; Gagan Narula; Milad Alizadeh; Ellen Gilsenan-McMahon; Sara Keen; Emmanuel Chemla; Benjamin Hoffman; Maddie Cusimano; Diane Kim; Felix Effenberger; Jane K. Lawton; Aza Raskin; Olivier Pietquin; Matthieu Geist

arXiv:2605.10494 · cs.SD · May 12, 2026

TL;DR

This paper investigates how the choice of probing strategy, in particular multi-layer and attention-based probes, affects the evaluation of audio representations on bioacoustic tasks, and finds that benchmarks relying on a last-layer probe may misrepresent encoder quality.

Contribution

It systematically compares probing methods across two bioacoustic benchmarks, BEANs and BirdSet, and shows that multi-layer and attention probes achieve better downstream performance than the standard linear probe on the last encoder layer.
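For orientation, the standard baseline named above (a linear probe on the last encoder layer) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' code: the function names `mean_pool` and `linear_probe`, the use of mean pooling over time, and the toy shapes are all assumptions made for the example.

```python
import random

def mean_pool(frames):
    """Average a (time x dim) list of frame embeddings into one dim-sized vector."""
    n_frames = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]

def linear_probe(layer_outputs, weights, bias):
    """Score classes using only the final encoder layer.

    layer_outputs: list over layers, each a (time x dim) list of frame embeddings
                   produced by a frozen encoder (hypothetical layout).
    weights:       (n_classes x dim) matrix; bias: n_classes vector.
    """
    # Only the last layer is used, and averaging discards the order of frames.
    pooled = mean_pool(layer_outputs[-1])
    return [sum(w * x for w, x in zip(row, pooled)) + b
            for row, b in zip(weights, bias)]

# Toy stand-in for frozen-encoder output: 4 layers, 5 frames, 3-dim embeddings.
rng = random.Random(0)
layer_outputs = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
                 for _ in range(4)]
weights = [[rng.gauss(0, 0.1) for _ in range(3)] for _ in range(2)]  # 2 classes
bias = [0.0, 0.0]

logits = linear_probe(layer_outputs, weights, bias)
```

In this setup the probe sees a single time-averaged vector from one layer, which is the low-capacity configuration the paper argues may not reflect what the encoder has actually learned.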

Findings

01. Multi-layer probing improves downstream task performance.

02. Attention probes outperform linear probes for transformer models.

03. Larger, time-aware probe heads yield better results.

Abstract

Probing heads map the representations learned from audio by a machine learning model to downstream task labels and are a key component in evaluating representation learning. Most bioacoustic benchmarks use a fixed, low-capacity probe, such as a linear layer on the final encoder layer. While this standardization enables model comparisons, it may bias results by overlooking the interaction between encoder features and probe design. In this work, we systematically study different probing strategies across two bioacoustic benchmarks, BEANs and BirdSet. We evaluate last- and multi-layer probing, across linear and attention probes. We show that larger probe heads that leverage time information have superior performance. Our results suggest that current benchmarks may misrepresent encoder quality when relying on a last-layer probing setup. Multi-layer probing improves downstream task…
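To make the contrast with a last-layer linear probe concrete, here is one plausible form of a multi-layer attentive probe. It is a sketch under stated assumptions, not the architecture from the paper: the softmax-weighted layer mixing, the single learned query for attention pooling over time, and every name (`mix_layers`, `attention_pool`, `multi_layer_attentive_probe`) are hypothetical choices made for illustration.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mix_layers(layer_outputs, layer_logits):
    """Combine all encoder layers per frame with softmax-normalized weights."""
    w = softmax(layer_logits)
    n_frames = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    return [[sum(w[l] * layer_outputs[l][t][d] for l in range(len(w)))
             for d in range(dim)]
            for t in range(n_frames)]

def attention_pool(frames, query):
    """Pool over time with attention weights from a learned query (assumed form)."""
    scale = math.sqrt(len(query))
    attn = softmax([dot(frame, query) / scale for frame in frames])
    dim = len(frames[0])
    pooled = [sum(a * frame[d] for a, frame in zip(attn, frames))
              for d in range(dim)]
    return pooled, attn

def multi_layer_attentive_probe(layer_outputs, layer_logits, query, weights, bias):
    """Layer mixing, then attention pooling over time, then a linear classifier."""
    mixed = mix_layers(layer_outputs, layer_logits)
    pooled, attn = attention_pool(mixed, query)
    logits = [dot(row, pooled) + b for row, b in zip(weights, bias)]
    return logits, attn

# Toy stand-in for frozen-encoder output: 4 layers, 5 frames, 3-dim embeddings.
rng = random.Random(0)
layer_outputs = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
                 for _ in range(4)]
layer_logits = [0.0, 0.0, 0.0, 0.0]          # trainable; uniform at initialization
query = [rng.gauss(0, 1) for _ in range(3)]  # trainable attention query
weights = [[rng.gauss(0, 0.1) for _ in range(3)] for _ in range(2)]  # 2 classes
bias = [0.0, 0.0]

logits, attn = multi_layer_attentive_probe(
    layer_outputs, layer_logits, query, weights, bias)
```

Unlike mean pooling on the final layer, this head can weight individual frames and draw on intermediate layers. The encoder stays frozen in both cases; only the probe parameters (`layer_logits`, `query`, `weights`, `bias` here) would be trained.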
