Codec-Robust Attacks on Audio LLMs
Jaechul Roh, Jean-Philippe Monteuuis, Jonathan Petit, Amir Houmansadr

TL;DR
CodecAttack crafts adversarial perturbations against Audio LLMs in a neural audio codec's latent space rather than in the waveform, so the perturbations survive codec-compression defenses and achieve high attack success rates across multiple codecs and models.
Contribution
The paper presents CodecAttack, an attack that optimizes perturbations in the codec's latent space. These perturbations remain effective under real-world compression and outperform waveform-based attacks.
Findings
Achieves an 85.5% attack success rate on Opus at moderate bitrates.
Transfers effectively to unseen codecs like MP3 and AAC-LC.
Latent perturbations concentrate below 4 kHz, aligning with codec bit allocation.
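One way to check a claim like the last finding is to measure how much of a perturbation's spectral energy lies below a cutoff frequency. The snippet below is a minimal sketch, not the paper's analysis code: the function name `low_band_energy_fraction`, the 16 kHz sample rate, and the two-tone test signal are all illustrative assumptions.

```python
import numpy as np

def low_band_energy_fraction(delta, sample_rate, cutoff_hz=4000.0):
    """Return the fraction of spectral energy at frequencies below cutoff_hz."""
    spectrum = np.fft.rfft(delta)
    freqs = np.fft.rfftfreq(len(delta), d=1.0 / sample_rate)
    power = np.abs(spectrum) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0
    return float(power[freqs < cutoff_hz].sum() / total)

# Hypothetical perturbation: a strong 1 kHz tone plus a weak 6 kHz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
delta = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.sin(2 * np.pi * 6000 * t)

frac = low_band_energy_fraction(delta, sample_rate)
print(f"energy below 4 kHz: {frac:.3f}")
```

For this synthetic signal the energy ratio between the two tones is 1 to 0.01, so roughly 99% of the energy sits below 4 kHz. On a real perturbation, `delta` would be the difference between the adversarial and the clean waveform.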
Abstract
Prior attacks on Audio Large Language Models (Audio LLMs) demonstrated that carefully crafted waveform-domain perturbations can force targeted adversarial outputs. As a defense mechanism against these attacks, real-world codec compression preprocessing has been studied to both detect and remove the perturbations. Yet no existing attack has demonstrated robustness against these compressions. We introduce CodecAttack, which optimizes a perturbation in a neural audio codec's continuous latent space rather than directly perturbing the audio waveform. We show that the codec's compression channel, which discards waveform perturbations, transmits perturbations crafted in its own latent space. To further harden the attack across real-world compression channels, we apply multi-bitrate straight-through Expectation-over-Transformation (EoT), all without modifying the target model. Across three…
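The idea of optimizing in latent space through a non-differentiable compression step can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a random linear map as the "decoder", a linear scorer as the "target model", and uniform quantization with several step sizes as a stand-in for a codec at several bitrates. Every name in it (`decoder`, `scorer`, `QUANT_STEPS`, `eot_loss_and_grad`) is hypothetical. It averages the loss over a fixed set of compression settings instead of sampling them, and it omits any norm bound on the perturbation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): a linear "decoder" and a linear "target model".
LATENT_DIM, WAVE_DIM = 8, 32
decoder = rng.normal(size=(WAVE_DIM, LATENT_DIM))
scorer = rng.normal(size=WAVE_DIM)
scorer /= np.linalg.norm(scorer)
z_clean = rng.normal(size=LATENT_DIM)

# A coarser quantization step stands in for a lower bitrate.
QUANT_STEPS = (0.02, 0.05, 0.1)

def compress(x, step):
    """Non-differentiable 'compression channel': uniform quantization."""
    return step * np.round(x / step)

def eot_loss_and_grad(delta, target):
    """Mean squared error over all quantization steps, with its gradient."""
    x = decoder @ (z_clean + delta)
    loss = 0.0
    grad = np.zeros_like(delta)
    for step in QUANT_STEPS:
        err = scorer @ compress(x, step) - target
        loss += err ** 2
        # Straight-through estimator: the backward pass treats compress()
        # as the identity, so the gradient flows through the decoder only.
        grad += 2.0 * err * (decoder.T @ scorer)
    n = len(QUANT_STEPS)
    return loss / n, grad / n

# Push the model's score 2.0 above its clean value.
target = scorer @ decoder @ z_clean + 2.0
delta = np.zeros(LATENT_DIM)
direction = decoder.T @ scorer
lr = 0.25 / (direction @ direction)  # step size scaled for stable descent

initial_loss, _ = eot_loss_and_grad(delta, target)
for _ in range(100):
    _, grad = eot_loss_and_grad(delta, target)
    delta -= lr * grad
final_loss, _ = eot_loss_and_grad(delta, target)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The forward pass always goes through the quantizer, so the loss reflects what the target model sees after compression, while the straight-through backward pass supplies a usable gradient despite the rounding. In a realistic setting the quantizer would be replaced by actual codec encode and decode calls, and the scorer by the Audio LLM's loss on the target output.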
