APEX: Audio Prototype EXplanations for Classification Tasks

Piotr Kawa; Kornel Howil; Piotr Borycki; Miłosz Adamczyk; Przemysław Spurek; Piotr Syga

arXiv:2605.10153 · cs.SD · May 12, 2026

TL;DR

APEX is a post-hoc interpretability framework for pre-trained audio classifiers. It requires no fine-tuning of the backbone and gives example-based explanations from four perspectives designed to respect acoustic properties.

Contribution

The paper introduces APEX, a method that gives multi-faceted, example-based explanations for audio classification and addresses the limitations of applying vision-based attribution techniques to spectrograms.

Findings

01

APEX provides four distinct explanation perspectives: square, time, frequency, and time-frequency.

02

APEX yields more semantically meaningful explanations than gradient-based methods.

03

The framework does not require fine-tuning of the original audio classifier.

Abstract

Explainable AI (XAI) has achieved remarkable success in image classification, yet the audio domain lacks equally mature solutions. Current methods apply vision-based attribution techniques to spectrograms, overlooking fundamental differences between visual and acoustic signals. While prototype reasoning is promising, acoustic similarity remains multidimensional. We introduce APEX (Audio Prototype EXplanations), a post-hoc framework for interpreting pre-trained audio classifiers. Crucially, APEX requires no fine-tuning of the original backbone and strictly preserves output invariance. APEX disentangles explanations into four perspectives: Square-based prototypes to localize transient events, Time-based for temporal patterns, Frequency-based highlighting spectral bands, and Time-Frequency-based integrating both. This yields intuitive, example-based explanations that respect acoustic…
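
To make the four perspectives concrete, here is a minimal sketch of the region shapes they might correspond to on a spectrogram. It is an illustration only, not the authors' implementation: the abstract does not say how prototype regions are defined, so the function name `perspective_masks`, its parameters, and the choice to model the time-frequency view as the union of a frequency band and a time span are all assumptions.

```python
import numpy as np


def perspective_masks(n_freq, n_time, f_range, t_range):
    """Build boolean masks of shape (n_freq, n_time) for four region shapes.

    Hypothetical helper: the shapes below are one plausible reading of the
    four perspectives named in the abstract, not APEX's actual definition.
    """
    f0, f1 = f_range
    t0, t1 = t_range

    # Frequency view: a spectral band over the full duration.
    freq_band = np.zeros((n_freq, n_time), dtype=bool)
    freq_band[f0:f1, :] = True

    # Time view: a time interval across all frequency bins.
    time_span = np.zeros((n_freq, n_time), dtype=bool)
    time_span[:, t0:t1] = True

    return {
        # Local patch, bounded in both axes (e.g. a transient event).
        "square": freq_band & time_span,
        "time": time_span,
        "frequency": freq_band,
        # Assumption: "integrating both" is modeled as the union of the two.
        "time_frequency": freq_band | time_span,
    }


# Toy spectrogram with 8 frequency bins and 10 time frames.
spec = np.random.default_rng(0).random((8, 10))
masks = perspective_masks(8, 10, f_range=(2, 4), t_range=(5, 7))

# Keep only the cells each perspective would highlight.
highlighted = {name: np.where(mask, spec, 0.0) for name, mask in masks.items()}
```

Each entry of `highlighted` keeps the spectrogram values inside one region and zeroes the rest, which is one simple way to visualize what each perspective attends to.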
