Audio Hallucination Attacks: Probing the Reliability of Large Audio Language Models
Ashish Seth, Sonal Kumar, Ramaneswaran Selvakumar, Nishit Anand, Utkarsh Tyagi, Prem Seetharaman, Ramani Duraiswami, Dinesh Manocha

TL;DR
This paper introduces Audio Hallucination Attacks (AHA), an attack suite that tests whether Large Audio Language Models (LALMs) ground their answers in the audio input. It finds high attack success rates on state-of-the-art models and proposes a dataset, AHA-Guard, to mitigate them.
Contribution
The authors develop AHA, a comprehensive attack suite for LALMs, and create AHA-Guard, a large dataset to improve model robustness against hallucination attacks.
Findings
State-of-the-art LALMs show high attack success rates: 95.35% for Audio Flamingo 3 and 79.65% for Gemini 3 Pro
AHA-Guard reduces attack success rates by up to 49%
Standard benchmarks do not reveal these vulnerabilities
Abstract
Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA) and an accompanying attack suite, AHA-Eval, comprising 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap that is hidden by standard benchmark performance. To mitigate this, we propose a 120K QA post-alignment dataset,…
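
The abstract reports results as attack success rates over the two attack surfaces. As a rough illustration only, the sketch below assumes that an attack "succeeds" when the model's answer describes a sound that is absent from the audio, and that the rate is the percentage of attack QA pairs on which this happens. The paper's exact scoring protocol is not given in this summary, and the names AttackResult, attack_success_rate, and asr_by_type are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AttackResult:
    # Hypothetical record for one attack QA pair (not the paper's schema).
    attack_type: str    # "query" or "audio", the two attack surfaces
    hallucinated: bool  # True if the answer described an absent sound


def attack_success_rate(results):
    """Percentage of attack QA pairs that induced a hallucination.

    Assumed definition: successes / total * 100.
    """
    if not results:
        raise ValueError("need at least one result")
    successes = sum(r.hallucinated for r in results)
    return 100.0 * successes / len(results)


def asr_by_type(results):
    """Attack success rate computed separately for each attack surface."""
    grouped = {}
    for r in results:
        grouped.setdefault(r.attack_type, []).append(r)
    return {name: attack_success_rate(group) for name, group in grouped.items()}


# Toy data for illustration; these are not results from the paper.
demo = [
    AttackResult("query", True),
    AttackResult("query", False),
    AttackResult("audio", True),
    AttackResult("audio", True),
]
print(attack_success_rate(demo))
print(asr_by_type(demo))
```

Reporting the rate per attack surface as well as overall makes it easier to see whether a model is more vulnerable to misleading questions or to injected speech.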
