SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

Giries Abu Ayoub; Morad Tukan; Loay Mualem

arXiv:2605.13672 · cs.CV · May 14, 2026

TL;DR

SpurAudio is a new benchmark designed to evaluate how few-shot audio classification models rely on contextual cues, revealing their vulnerabilities to spurious correlations and highlighting the importance of context-aware evaluation.

Contribution

The paper introduces SpurAudio, a benchmark that enables controlled assessment of contextual shifts in few-shot audio classification and exposes how models depend on spurious correlations between foreground events and background environments.

Findings

01. State-of-the-art few-shot methods degrade when background cues are disrupted.

02. Large pretrained models are also vulnerable to context shifts.

03. Different algorithms show varying sensitivity to spurious correlations.

Abstract

Few-shot classification (FSC) is widely used for learning from limited labeled data, yet most evaluations implicitly assume that target concepts are independent of contextual cues. In real-world settings, however, examples often appear within rich contexts, allowing models to exploit spurious correlations between foreground content and background signals. While such effects have been studied in few-shot image classification, their role in few-shot audio classification remains largely unexplored, and existing audio benchmarks offer limited control over contextual structure. We introduce SpurAudio, a benchmark that leverages the natural separability of foreground events and background environments in audio to enable controlled, multi-level evaluation of contextual shifts across support and query sets. Using this benchmark, we show that many state-of-the-art few-shot methods suffer severe…
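To make the idea of a contextual shift between support and query sets concrete, here is a minimal Python sketch of episode sampling. It is not the authors' pipeline: the function names, the `(clip_id, foreground, background)` tuple layout, and the toy class and environment names are all assumptions for illustration. It also covers only two settings (backgrounds aligned or shifted), whereas the abstract describes a multi-level evaluation whose details are not given here.

```python
import random
from collections import defaultdict


def sample_episode(clips, n_way, k_shot, n_query, shifted, rng):
    """Sample one N-way K-shot episode from (clip_id, foreground, background) tuples.

    Hypothetical helper. If `shifted` is False, each class's query clips share
    the background of its support clips. If True, the query clips come from a
    different background, so the support-time correlation no longer holds.
    Assumes every class has at least two backgrounds with enough clips each.
    """
    # Index clip ids by foreground class, then by background environment.
    index = defaultdict(lambda: defaultdict(list))
    for clip_id, fg, bg in clips:
        index[fg][bg].append(clip_id)

    support, query = [], []
    for fg in rng.sample(sorted(index), n_way):
        backgrounds = sorted(index[fg])
        support_bg = rng.choice(backgrounds)
        if shifted:
            # Query background is drawn from the remaining environments.
            query_bg = rng.choice([b for b in backgrounds if b != support_bg])
            s_ids = rng.sample(index[fg][support_bg], k_shot)
            q_ids = rng.sample(index[fg][query_bg], n_query)
        else:
            # Same background: draw once and split so the sets stay disjoint.
            query_bg = support_bg
            drawn = rng.sample(index[fg][support_bg], k_shot + n_query)
            s_ids, q_ids = drawn[:k_shot], drawn[k_shot:]
        support += [(c, fg, support_bg) for c in s_ids]
        query += [(c, fg, query_bg) for c in q_ids]
    return support, query


def make_toy_clips(classes, backgrounds, per_cell):
    """Build placeholder clip records; the ids stand in for real audio files."""
    return [
        (f"{fg}_{bg}_{i}", fg, bg)
        for fg in classes
        for bg in backgrounds
        for i in range(per_cell)
    ]


if __name__ == "__main__":
    clips = make_toy_clips(["dog", "siren", "speech"], ["street", "cafe"], 10)
    support, query = sample_episode(clips, 2, 3, 4, shifted=True, rng=random.Random(0))
    print(len(support), len(query))
```

Running the same few-shot method on episodes sampled with `shifted=False` and `shifted=True` and comparing query accuracy is one simple way to probe how much a model leans on background cues, under the assumptions stated above.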
