APEX: Audio Prototype EXplanations for Classification Tasks
Piotr Kawa, Kornel Howil, Piotr Borycki, Miłosz Adamczyk, Przemysław Spurek, Piotr Syga

TL;DR
APEX is a novel post-hoc interpretability framework for pre-trained audio classifiers, providing intuitive, multi-perspective explanations that respect acoustic properties without fine-tuning.
Contribution
The paper introduces APEX, a method that offers multi-perspective, example-based explanations for audio classification, addressing the limitations of vision-based attribution techniques applied to spectrograms.
Findings
APEX provides four distinct explanation perspectives: square, time, frequency, and time-frequency.
APEX yields more semantically meaningful explanations than gradient-based methods.
The framework does not require fine-tuning of the original audio classifier.
Abstract
Explainable AI (XAI) has achieved remarkable success in image classification, yet the audio domain lacks equally mature solutions. Current methods apply vision-based attribution techniques to spectrograms, overlooking fundamental differences between visual and acoustic signals. While prototype reasoning is promising, acoustic similarity remains multidimensional. We introduce APEX (Audio Prototype EXplanations), a post-hoc framework for interpreting pre-trained audio classifiers. Crucially, APEX requires no fine-tuning of the original backbone and strictly preserves output invariance. APEX disentangles explanations into four perspectives: Square-based prototypes to localize transient events, Time-based for temporal patterns, Frequency-based highlighting spectral bands, and Time-Frequency-based integrating both. This yields intuitive, example-based explanations that respect acoustic…
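To make the four perspectives concrete, here is a minimal sketch of how a backbone feature map over a spectrogram could be summarized into square, time, frequency, and time-frequency views and then matched against stored prototypes by cosine similarity. The abstract does not describe the actual mechanism, so everything here is an assumption for illustration: the (channels, frequency, time) layout, the mean pooling, the patch size, the concatenation used for the time-frequency view, and the hypothetical helper names `perspective_views` and `nearest_prototype`.

```python
import numpy as np


def perspective_views(feat, patch=2):
    """Summarize a (C, F, T) feature map into four illustrative views.

    Hypothetical helper: the pooling choices are assumptions made for
    illustration and are not taken from the APEX paper.
    """
    c, f, t = feat.shape
    # Time view: average over frequency, one C-dim vector per time frame.
    time_view = feat.mean(axis=1).T  # shape (T, C)
    # Frequency view: average over time, one C-dim vector per frequency bin.
    freq_view = feat.mean(axis=2).T  # shape (F, C)
    # Square view: average over non-overlapping patch x patch tiles.
    fp, tp = f // patch, t // patch
    tiles = feat[:, :fp * patch, :tp * patch].reshape(c, fp, patch, tp, patch)
    square_view = tiles.mean(axis=(2, 4)).reshape(c, fp * tp).T  # (fp*tp, C)
    # Time-frequency view (assumed): concatenate both pooled profiles.
    tf_view = np.concatenate([time_view.ravel(), freq_view.ravel()])
    return {
        "square": square_view,
        "time": time_view,
        "frequency": freq_view,
        "time_frequency": tf_view,
    }


def nearest_prototype(vectors, prototypes):
    """Return the index and cosine similarity of the closest prototype per row."""
    v = vectors / (np.linalg.norm(vectors, axis=1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sims = v @ p.T
    return sims.argmax(axis=1), sims.max(axis=1)


# Toy feature map standing in for a frozen backbone's output.
feat = np.arange(2 * 4 * 6, dtype=float).reshape(2, 4, 6)
views = perspective_views(feat)
# Use the first three time vectors as stand-in prototypes.
idx, sim = nearest_prototype(views["time"], views["time"][:3])
```

Because the sketch only reads the feature map and never modifies the classifier, it is consistent with the post-hoc, no-fine-tuning setting the abstract describes, though it says nothing about how APEX itself preserves output invariance.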
