Attacker's Noise Can Manipulate Your Audio-based LLM in the Real World
Vinu Sankar Sadasivan, Soheil Feizi, Rajiv Mathews, Lun Wang

TL;DR
This paper shows that audio-based large language models (ALLMs) are vulnerable to adversarial audio perturbations and background noise, which can manipulate their responses and degrade their performance in real-world settings, posing security risks.
Contribution
It demonstrates the feasibility of crafting stealthy audio attacks on ALLMs and highlights the impact of background noise on their response quality, emphasizing real-world security concerns.
Findings
Adversarial audio can manipulate ALLMs into targeted behaviors.
Background noise significantly degrades ALLMs' response quality.
Attacks are scalable and transferable to real-world scenarios.
Abstract
This paper investigates the real-world vulnerabilities of audio-based large language models (ALLMs), such as Qwen2-Audio. We first demonstrate that an adversary can craft stealthy audio perturbations to manipulate ALLMs into exhibiting specific targeted behaviors, such as eliciting responses to wake-keywords (e.g., "Hey Qwen"), or triggering harmful behaviors (e.g., "Change my calendar event"). Subsequently, we show that playing adversarial background noise during user interaction with the ALLMs can significantly degrade the response quality. Crucially, our research illustrates the scalability of these attacks to real-world scenarios, impacting other innocent users when these adversarial noises are played through the air. Further, we discuss the transferability of the attack and potential defensive measures.
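The abstract does not spell out how the perturbations are optimized, so the following is only a minimal sketch of the general idea of a small, bounded perturbation that pushes a model toward a target behavior. It assumes a toy linear scorer in place of a real ALLM and uses sign-gradient ascent with an L-infinity bound; the names `craft_perturbation`, `toy_score`, `eps`, and `step` are hypothetical and do not come from the paper.

```python
import numpy as np

def toy_score(audio, weights):
    # Stand-in for "how strongly the model exhibits the target behavior".
    # A real ALLM would replace this; a linear score is an assumption.
    return float(np.dot(weights, audio))

def craft_perturbation(audio, weights, eps=0.01, step=0.002, iters=50):
    """Sign-gradient ascent on the toy score under an L-infinity bound."""
    delta = np.zeros_like(audio)
    for _ in range(iters):
        # For a linear score, the gradient with respect to the input is `weights`.
        grad = weights
        delta = delta + step * np.sign(grad)
        # Keep the perturbation small so it stays hard to notice.
        delta = np.clip(delta, -eps, eps)
        # Keep the perturbed waveform inside the valid [-1, 1] range.
        delta = np.clip(audio + delta, -1.0, 1.0) - audio
    return delta

rng = np.random.default_rng(0)
audio = rng.uniform(-0.5, 0.5, size=16000)   # one second of toy audio at 16 kHz
weights = rng.normal(size=16000)             # toy model parameters
delta = craft_perturbation(audio, weights)

clean = toy_score(audio, weights)
attacked = toy_score(audio + delta, weights)
print(f"clean score: {clean:.3f}, attacked score: {attacked:.3f}")
```

A realistic over-the-air attack would also have to survive speaker, room, and microphone distortions, which this sketch does not model.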
Taxonomy
Topics: Adversarial Robustness in Machine Learning · User Authentication and Security Systems · Advanced Malware Detection Techniques
