Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding
Lea Schönherr, Katharina Kohls, Steffen Zeiler, Thorsten Holz, Dorothea Kolossa

TL;DR
This paper presents a psychoacoustic hiding attack on automatic speech recognition (ASR) systems: malicious commands are embedded in audio signals so that the changes are imperceptible to human listeners yet the ASR system transcribes the hidden command, with high reported success rates.
Contribution
The paper introduces an adversarial attack that combines a psychoacoustic model with backpropagation to generate audio whose malicious perturbation is imperceptible to listeners but changes the ASR transcription, giving a new tool for evaluating the security of ASR systems.
Findings
Achieved up to 98% attack success rate
Perturbations remain inaudible to human listeners
Attack requires less than two minutes for a ten-second audio sample
Abstract
Voice interfaces are becoming widely accepted as input methods for a diverse set of devices. This development is driven by rapid improvements in automatic speech recognition (ASR), which now performs on par with human listening in many tasks. These improvements are based on an ongoing evolution of DNNs as the computational core of ASR. However, recent research results show that DNNs are vulnerable to adversarial perturbations, which allow attackers to force the transcription into a malicious output. In this paper, we introduce a new type of adversarial example based on psychoacoustic hiding. Our attack exploits the characteristics of DNN-based ASR systems, where we extend the original analysis procedure by an additional backpropagation step. We use this backpropagation to learn the degrees of freedom for the adversarial perturbation of the input signal, i.e., we apply a psychoacoustic…
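The idea in the abstract, using backpropagated gradients to shape a perturbation while keeping it below what a listener could notice, can be illustrated with a toy sketch. Everything here is an assumption for illustration: the linear score stands in for a DNN-based recognizer, the per-component `threshold` values are invented rather than derived from a psychoacoustic model, and the names `clip_to_threshold` and `hidden_perturbation` are hypothetical. It is not the authors' implementation.

```python
def clip_to_threshold(delta, threshold):
    """Limit each perturbation component to +/- its allowed magnitude.

    `threshold` is a made-up stand-in for a per-component hearing threshold.
    """
    return [max(-t, min(t, d)) for d, t in zip(delta, threshold)]


def hidden_perturbation(x, weights, target, threshold, step=0.1, iters=200):
    """Projected gradient descent on a toy linear 'recognizer'.

    The score is sum(w * (x + delta)); we push it toward `target` while
    keeping every component of `delta` inside its threshold.
    """
    delta = [0.0] * len(x)
    for _ in range(iters):
        score = sum(w * (xi + di) for w, xi, di in zip(weights, x, delta))
        err = score - target               # d/dscore of 0.5 * (score - target)**2
        grad = [err * w for w in weights]  # gradient propagated back to the input
        delta = [d - step * g for d, g in zip(delta, grad)]
        delta = clip_to_threshold(delta, threshold)  # enforce the hiding constraint
    return delta


# Illustrative values only (assumptions, not data from the paper).
x = [1.0, 0.5, -0.2, 0.3]
weights = [0.5, -1.0, 2.0, 0.1]
threshold = [0.05, 0.5, 0.5, 0.01]
delta = hidden_perturbation(x, weights, target=1.0, threshold=threshold)
final_score = sum(w * (xi + di) for w, xi, di in zip(weights, x, delta))
```

In this sketch the components with small thresholds saturate early, and the remaining change needed to reach the target is carried by the components where a larger perturbation is allowed.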
