Audio Hallucination Attacks: Probing the Reliability of Large Audio Language Models

Ashish Seth; Sonal Kumar; Ramaneswaran Selvakumar; Nishit Anand; Utkarsh Tyagi; Prem Seetharaman; Ramani Duraiswami; Dinesh Manocha

arXiv:2603.29263·cs.SD·April 1, 2026

TL;DR

This paper introduces Audio Hallucination Attacks (AHA), an attack suite that tests whether Large Audio Language Models ground their responses in the audio input. It finds high attack success rates on state-of-the-art models and proposes a post-alignment dataset to mitigate them.

Contribution

The authors develop AHA, an attack suite for LALMs that covers query-based and audio-based attacks, and create AHA-Guard, a post-alignment dataset intended to improve model robustness against hallucination attacks.

Findings

01. State-of-the-art LALMs show high attack success rates: 95.35% for Audio Flamingo 3 and 79.65% for Gemini 3 Pro.

02. AHA-Guard reduces attack success rates by up to 49%.

03. Standard benchmark performance does not reveal these vulnerabilities.

Abstract

Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA), an attack suite called AHA-Eval, comprising 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap that is hidden by standard benchmark performance. To mitigate this, we propose a 120K QA post-alignment dataset,…
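
The abstract reports attack success rates but the excerpt does not define the metric. The sketch below assumes the common reading: the percentage of attack QA pairs on which the model affirms a sound or event that is absent from the audio. The `AttackResult` record, the function names, and the toy data are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass
class AttackResult:
    # Hypothetical record for one attack QA pair.
    attack_type: str    # "query" or "audio", the two attack surfaces in the abstract
    hallucinated: bool  # True if the model affirmed an absent sound or event


def attack_success_rate(results):
    """Percentage of attack QA pairs on which the model hallucinated (assumed definition)."""
    if not results:
        raise ValueError("no results to score")
    hits = sum(1 for r in results if r.hallucinated)
    return 100.0 * hits / len(results)


def asr_by_type(results):
    """Attack success rate computed separately for each attack surface."""
    groups = {}
    for r in results:
        groups.setdefault(r.attack_type, []).append(r)
    return {name: attack_success_rate(group) for name, group in groups.items()}


# Toy data for illustration only; these are not results from the paper.
results = [
    AttackResult("query", True),
    AttackResult("query", True),
    AttackResult("query", False),
    AttackResult("audio", True),
]
overall = attack_success_rate(results)
per_type = asr_by_type(results)
```

Reporting the rate per attack surface as well as overall keeps a model that resists one kind of attack from masking weakness against the other.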
