AR&D: A Framework for Retrieving and Describing Concepts for Interpreting AudioLLMs

Townim Faisal Chowdhury; Ta Duc Huy; Siqi Pan; Jeremy Stoddard; Zhibin Liao

arXiv:2602.22253·cs.SD·February 27, 2026

Open Access

TL;DR

This paper introduces AR&D, an interpretability framework for AudioLLMs that uses sparse autoencoders to disentangle polysemantic activations into monosemantic features, improving transparency and enabling steering of model behavior.

Contribution

The authors present what they describe as the first mechanistic interpretability framework for AudioLLMs. Its pipeline retrieves representative audio clips for each feature, names the concepts via automated captioning, and validates them through human evaluation and steering.

Findings

01. AudioLLMs encode structured, interpretable features
02. The framework improves transparency and control
03. It lays groundwork for trustworthy deployment in high-stakes domains

Abstract

Despite strong performance in audio perception tasks, large audio-language models (AudioLLMs) remain opaque to interpretation. A major factor behind this lack of interpretability is that individual neurons in these models frequently activate in response to several unrelated concepts. We introduce the first mechanistic interpretability framework for AudioLLMs, leveraging sparse autoencoders (SAEs) to disentangle polysemantic activations into monosemantic features. Our pipeline identifies representative audio clips, assigns meaningful names via automated captioning, and validates concepts through human evaluation and steering. Experiments show that AudioLLMs encode structured and interpretable features, enhancing transparency and control. This work provides a foundation for trustworthy deployment in high-stakes domains and enables future extensions to larger models, multilingual audio,…
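The abstract's two core steps, decomposing activations with a sparse autoencoder and retrieving representative clips per feature, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the ReLU encoder, the L1 sparsity penalty, the dimensions, the random weights, and the function names `sae_forward`, `sae_loss`, and `top_clips` are all assumptions chosen for illustration, and the random matrix `x` only stands in for pooled AudioLLM activations.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Forward pass of a ReLU sparse autoencoder (assumed architecture).

    x: (n_clips, d_model) activations; returns sparse features and reconstruction.
    """
    f = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)  # non-negative feature activations
    x_hat = f @ W_dec + b_dec                         # reconstruction of the input
    return f, x_hat


def sae_loss(x, x_hat, f, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse features."""
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(f), axis=1))
    return recon + l1_coeff * sparsity


def top_clips(f, feature_idx, k=3):
    """Indices of the k clips that activate one feature most strongly."""
    order = np.argsort(-f[:, feature_idx], kind="stable")
    return order[:k].tolist()


# Hypothetical sizes and random weights; a trained SAE would be used in practice.
rng = np.random.default_rng(0)
n_clips, d_model, d_sae = 8, 16, 64
x = rng.normal(size=(n_clips, d_model))  # stand-in for pooled model activations
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

f, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, x_hat, f)
clips = top_clips(f, feature_idx=0, k=3)
```

In a pipeline like the one described, the clips returned by `top_clips` for a feature would be the candidates passed to a captioning step to name the concept; how the paper selects and captions them is not specified in this abstract.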

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Music and Audio Processing · Speech Recognition and Synthesis