TL;DR
This paper shows that Audio Large Language Models (ALLMs) are vulnerable to backdoor attacks triggered by subtle acoustic patterns such as environmental noise and speech-rate changes, with attack success rates above 90%.
Contribution
The paper presents Hidden in the Noise (HIN), a backdoor attack method that targets Audio Large Language Models (ALLMs) through acoustic triggers, and introduces the AudioSafe benchmark for evaluating model robustness against such attacks.
Findings
Environmental-noise and speech-rate triggers achieve attack success rates above 90%.
Volume-based triggers elicit only a minimal response from ALLMs.
Poisoned samples cause only marginal changes in loss curves, indicating stealthiness.
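The three trigger families named in the findings (environmental noise, speech rate, volume) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the SNR-based mixing rule, and the naive resampling are assumptions chosen for illustration, and the paper's spectrally tailored noise is not reproduced here.

```python
import numpy as np

def add_noise_at_snr(wave, noise, snr_db):
    """Mix `noise` into `wave` at a chosen signal-to-noise ratio in dB.

    Hypothetical helper: the source does not specify a mixing rule.
    """
    # Repeat or truncate the noise clip so it matches the speech length.
    noise = np.resize(noise, wave.shape)
    sig_power = np.mean(wave ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that sig_power / scaled_noise_power == 10^(snr_db/10).
    scale = np.sqrt(sig_power / (noise_power * 10 ** (snr_db / 10)))
    return wave + scale * noise

def change_rate(wave, rate):
    """Speed audio up (rate > 1) or slow it down (rate < 1).

    Naive linear-interpolation resampling; note that it also shifts pitch,
    unlike a proper time-stretch algorithm.
    """
    n_out = int(round(len(wave) / rate))
    src_positions = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(src_positions, np.arange(len(wave)), wave)

def change_volume(wave, gain_db):
    """Apply a constant gain in dB (amplitude scales by 10^(gain_db/20))."""
    return wave * 10 ** (gain_db / 20)

# Toy usage on a synthetic 1-second, 16 kHz tone standing in for speech.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
speech = 0.5 * np.sin(2 * np.pi * 220 * t)
ambient = rng.standard_normal(8000)  # stand-in for an environmental noise clip

noisy = add_noise_at_snr(speech, ambient, snr_db=10.0)
faster = change_rate(speech, rate=1.25)
quieter = change_volume(speech, gain_db=-6.0)
```

In a poisoning setup, a transformation like one of these would be applied to a small fraction of training clips whose labels are set to the attacker's target output; that pairing step is omitted here because the summary gives no details on it.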
Abstract
As Audio Large Language Models (ALLMs) emerge as powerful tools for speech processing, their safety implications demand urgent attention. While considerable research has explored text and vision safety, audio's distinct characteristics present significant challenges. This paper first asks: are ALLMs vulnerable to backdoor attacks that exploit acoustic triggers? To answer this question, we introduce Hidden in the Noise (HIN), a novel backdoor attack framework designed to exploit subtle, audio-specific features. HIN applies acoustic modifications to raw audio waveforms, such as alterations to temporal dynamics and strategic injection of spectrally tailored noise. These changes introduce consistent patterns that an ALLM's acoustic feature encoder captures, embedding robust triggers within the audio stream. To evaluate ALLM robustness against audio-feature-based triggers, we…