BAT: Better Audio Transformer Guided by Convex Gated Probing
Houtan Ghaffari, Lukas Rauch, Christoph Scholz, Paul Devos

TL;DR
This paper introduces Convex Gated Probing (CGP), a prototype-based method that improves probing of audio SSL models. Guided by CGP, the authors develop the Better Audio Transformer (BAT), which achieves new state-of-the-art results on audio benchmarks.
Contribution
The paper presents CGP as a robust probing technique for audio SSL and uses it to guide model design and training, yielding BAT, which sets new state-of-the-art results on audio benchmarks.
Findings
CGP efficiently utilizes all frozen layers of an audio SSL model via a gating mechanism and exposes where task-relevant information is located.
BAT achieves new state-of-the-art results on audio benchmarks.
Refined SSL pipeline improves model performance and reliability.
Abstract
Probing is widely adopted in computer vision to faithfully evaluate self-supervised learning (SSL) embeddings, as fine-tuning may misrepresent their inherent quality. In contrast, audio SSL models still rely on fine-tuning because simple probing fails to unlock their full potential and alters their rankings when competing for SOTA on AudioSet. Hence, a robust and efficient probing mechanism is required to guide the trajectory of audio SSL towards reliable and reproducible methods. We introduce Convex Gated Probing (CGP), a prototype-based method that drastically closes the gap between fine-tuning and probing in audio. CGP efficiently utilizes all frozen layers via a gating mechanism and exposes the location of latent task-relevant information. Guided by CGP, we rework the entire SSL pipeline of current SOTA audio models that use legacy implementations of prior SSL methods. By refining…
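To make the idea of gating over frozen layers concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes that "convex" means the gate weights are non-negative and sum to one (obtained here with a softmax), and that "prototype-based" means scoring by cosine similarity to one prototype per class. Neither detail is stated in the abstract, and the names `convex_gated_probe`, `gate_logits`, and `prototypes` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def convex_gated_probe(layer_embeddings, gate_logits, prototypes):
    """Illustrative probe over frozen layer outputs (assumed design).

    layer_embeddings: (L, N, D) pooled embeddings from L frozen layers
    gate_logits:      (L,) learnable gate parameters (assumption)
    prototypes:       (C, D) learnable class prototypes (assumption)
    Returns cosine scores of shape (N, C) and the gate weights (L,).
    """
    # Softmax yields non-negative weights that sum to 1, so the mix
    # below is a convex combination of the per-layer embeddings.
    gates = softmax(gate_logits)
    mixed = np.tensordot(gates, layer_embeddings, axes=1)  # (N, D)

    # Cosine similarity between mixed embeddings and class prototypes.
    eps = 1e-8
    mixed_unit = mixed / (np.linalg.norm(mixed, axis=1, keepdims=True) + eps)
    proto_unit = prototypes / (
        np.linalg.norm(prototypes, axis=1, keepdims=True) + eps
    )
    scores = mixed_unit @ proto_unit.T  # (N, C)
    return scores, gates


# Toy usage with random data standing in for frozen SSL features.
rng = np.random.default_rng(0)
layers = rng.normal(size=(12, 4, 16))  # 12 layers, 4 clips, 16 dims
scores, gates = convex_gated_probe(
    layers, rng.normal(size=12), rng.normal(size=(5, 16))
)
```

In a sketch like this, only `gate_logits` and `prototypes` would be trained while the backbone stays frozen, and inspecting the learned `gates` gives one way to see which layers carry the task-relevant information.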
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
