What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model
Takao Kawamura, Daisuke Niizumi, Nobutaka Ono

TL;DR
This paper investigates the internal neuron-level representations of a general-purpose audio SSL model, revealing class-specific neurons that contribute to its robust generalization across diverse audio tasks.
Contribution
It provides the first systematic neuron-level analysis of a general-purpose audio SSL model, uncovering class-specific neurons and their role in model performance.
Findings
Identification of class-specific neurons across tasks
Shared responses among neurons for different semantic categories
Confirmation of neurons' impact on classification accuracy
Abstract
In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Despite their strong empirical performance as feature extractors, the internal mechanisms underlying the robust generalization of SSL audio models remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL models foster the emergence of class-specific neurons that provide extensive coverage across novel task classes. These neurons exhibit shared responses across different semantic categories and acoustic similarities, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the…
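The abstract describes identifying class-specific neurons from conditional activation patterns. The following is a minimal sketch of that general idea, not the authors' procedure: it estimates how often each neuron fires given a class and flags neurons that fire often for one class and rarely for the others. The function names, the thresholds (`min_rate`, `max_other`), and the toy data are all assumptions made for illustration. The exclusivity test is also a simplification, since the paper reports neurons whose responses are shared across related classes.

```python
import numpy as np

def conditional_activation_rates(acts, labels, n_classes, threshold=0.0):
    """Return an (n_classes, n_neurons) array of P(activation > threshold | class)."""
    fired = acts > threshold
    rates = np.zeros((n_classes, acts.shape[1]))
    for c in range(n_classes):
        mask = labels == c
        if mask.any():
            rates[c] = fired[mask].mean(axis=0)
    return rates

def class_specific_neurons(rates, min_rate=0.8, max_other=0.2):
    """Map each class to neurons that fire often for it and rarely for all others.

    Both thresholds are hypothetical values chosen for this sketch.
    """
    result = {}
    for c in range(rates.shape[0]):
        # Highest firing rate of each neuron over every class except c.
        others = np.delete(rates, c, axis=0).max(axis=0)
        idx = np.flatnonzero((rates[c] >= min_rate) & (others <= max_other))
        result[c] = idx.tolist()
    return result

# Toy data (assumed): 3 classes, 4 neurons, 50 samples per class.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
acts = rng.normal(-1.0, 0.1, size=(150, 4))  # neurons are inactive by default
acts[labels == 0, 0] = 1.0  # neuron 0 fires only for class 0
acts[labels == 2, 3] = 1.0  # neuron 3 fires only for class 2

rates = conditional_activation_rates(acts, labels, n_classes=3)
specific = class_specific_neurons(rates)
print(specific)
```

To probe functional impact in the same spirit, one could zero the flagged columns of `acts` before passing the features to a downstream classifier and compare per-class accuracy with and without the ablation.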
Taxonomy
Topics: Music and Audio Processing · Neuroscience and Music Perception · Emotion and Mood Recognition
