Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers

Liang Lin; Miao Yu; Kaiwen Luo; Yibo Zhang; Lilan Peng; Dexian Wang; Xuehai Tang; Yuanhe Zhang; Xikang Yang; Zhenhong Zhou; Kun Wang; Yang Liu

arXiv:2508.02175·cs.SD·November 19, 2025

TL;DR

This paper uncovers vulnerabilities in Audio Large Language Models by introducing a novel backdoor attack framework that exploits subtle acoustic patterns, demonstrating significant risks and challenges in ensuring audio model safety.

Contribution

The paper presents HIN, a new backdoor attack method targeting ALLMs through acoustic triggers, and introduces the AudioSafe benchmark for evaluating model robustness against such attacks.

Findings

01 Over 90% attack success rate using environmental noise and speech rate triggers.

02 ALLMs show minimal response to volume-based triggers.

03 Poisoned samples cause only marginal changes in loss curves, indicating stealthiness.

Abstract

As Audio Large Language Models (ALLMs) emerge as powerful tools for speech processing, their safety implications demand urgent attention. While considerable research has explored text and vision safety, audio's distinct characteristics present significant challenges. This paper first investigates: Are ALLMs vulnerable to backdoor attacks that exploit acoustic triggers? To answer this question, we introduce Hidden in the Noise (HIN), a novel backdoor attack framework designed to exploit subtle, audio-specific features. HIN applies acoustic modifications to raw audio waveforms, such as alterations to temporal dynamics and strategic injection of spectrally tailored noise. These changes introduce consistent patterns that an ALLM's acoustic feature encoder captures, embedding robust triggers within the audio stream. To evaluate ALLM robustness against audio-feature-based triggers, we…
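To make the two waveform-level modifications named in the abstract more concrete, here is a minimal illustrative sketch in plain Python. It is not the paper's implementation: the function names (change_speed, band_noise, add_noise_at_snr), the linear-interpolation resampler, the sum-of-sinusoids noise model, and all parameter values are assumptions chosen for readability. The first function alters temporal dynamics by changing playback rate; the other two build noise confined to a chosen frequency band and mix it in at a target signal-to-noise ratio.

```python
import math
import random


def change_speed(wave, rate):
    """Resample by linear interpolation so the clip plays `rate` times faster.

    Assumption: naive resampling, which also shifts pitch. A real pipeline
    might use a pitch-preserving time-stretch instead.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_out = max(1, round(len(wave) / rate))
    out = []
    for i in range(n_out):
        pos = min(i * rate, len(wave) - 1)   # fractional read position
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append((1 - frac) * wave[lo] + frac * wave[hi])
    return out


def band_noise(n, sr, low_hz, high_hz, n_tones=32, seed=0):
    """Noise limited to [low_hz, high_hz], built from random-phase sinusoids.

    Assumption: a simple stand-in for "spectrally tailored" noise.
    """
    rng = random.Random(seed)
    tones = [(rng.uniform(low_hz, high_hz), rng.uniform(0, 2 * math.pi))
             for _ in range(n_tones)]
    return [sum(math.sin(2 * math.pi * f * t / sr + p) for f, p in tones)
            for t in range(n)]


def power(x):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(wave, noise, snr_db):
    """Scale `noise` so the mix has the requested SNR in dB, then add it."""
    scale = math.sqrt(power(wave) / (power(noise) * 10 ** (snr_db / 10)))
    return [w + scale * n for w, n in zip(wave, noise)]
```

In a sketch like this, a lower SNR makes the injected pattern more audible, while a rate close to 1.0 keeps the tempo change subtle; which settings an encoder actually picks up is an empirical question that the paper's experiments address.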
