Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Nicholas Carlini; David Wagner

arXiv:1801.01944·cs.LG·April 2, 2018

TL;DR

This paper demonstrates the creation of targeted audio adversarial examples that are nearly indistinguishable from original audio but transcribe as any chosen phrase, highlighting vulnerabilities in speech recognition systems.

Contribution

The authors develop a white-box iterative attack method to generate targeted audio adversarial examples with high success rates on DeepSpeech, revealing new security challenges in speech recognition.
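The general shape of a white-box iterative targeted attack can be sketched on a toy model. This is not the authors' implementation: DeepSpeech is a recurrent sequence model trained with CTC loss, whereas the sketch below swaps in a single linear layer with softmax and a plain cross-entropy loss so that the gradient can be written by hand. The names `predict_probs` and `targeted_attack`, the toy weights `W`, the waveform `x`, and the values of `eps`, `lr`, and `steps` are all assumptions made for illustration.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_probs(W, x):
    """Toy stand-in for a recognizer: one linear layer followed by softmax."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def targeted_attack(W, x, target, eps=0.05, lr=0.001, steps=100):
    """Return delta with max |delta_i| <= eps that pushes the toy model toward `target`."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        adv = [v + d for v, d in zip(x, delta)]
        p = predict_probs(W, adv)
        # White-box step: for this model the gradient of cross-entropy(target)
        # with respect to the input is W^T (p - onehot(target)).
        err = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
        grad = [sum(W[k][i] * err[k] for k in range(len(W))) for i in range(len(x))]
        # Signed gradient step, then clip back into the L-infinity ball of radius eps.
        delta = [max(-eps, min(eps, d - lr * math.copysign(1.0, g)))
                 for d, g in zip(delta, grad)]
    return delta

# Hypothetical toy data: 300 "samples", 3 classes; class k sums the samples with i % 3 == k.
n = 300
W = [[1.0 if i % 3 == k else 0.0 for i in range(n)] for k in range(3)]
x = [0.5 * math.sin(0.1 * i) + (0.02 if i % 3 == 0 else 0.0) for i in range(n)]

delta = targeted_attack(W, x, target=2)
adv = [v + d for v, d in zip(x, delta)]
before = predict_probs(W, x)
after = predict_probs(W, adv)
print("predicted class before:", before.index(max(before)))
print("predicted class after: ", after.index(max(after)))
print("max |delta|:", max(abs(d) for d in delta))
```

The attacker needs the model's weights to compute the gradient, which is what makes the setting white-box. The clipping step is what keeps the perturbation small relative to the signal, here at most 0.05 against a waveform with amplitude of about 0.5.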

Findings

01

Achieved over 99.9% similarity between original and adversarial audio

02

Adversarial audio transcribed as the chosen target phrase with a 100% success rate on DeepSpeech

03

Introduced a new domain for adversarial example research in audio

Abstract

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
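The abstract does not say how similarity is measured. One common way to quantify how quiet a perturbation is, relative to the audio it is added to, is to compare peak loudness in decibels. The sketch below illustrates that idea and should not be read as the exact definition behind the 99.9% figure; `peak_db`, `relative_distortion_db`, and the sample signals are hypothetical.

```python
import math

def peak_db(signal):
    """Peak loudness in dB: 20 * log10 of the largest absolute sample.

    Raises ValueError for an all-zero signal, since log10(0) is undefined.
    """
    peak = max(abs(s) for s in signal)
    return 20.0 * math.log10(peak)

def relative_distortion_db(original, perturbation):
    """Loudness of the perturbation relative to the original (more negative = quieter)."""
    return peak_db(perturbation) - peak_db(original)

# Hypothetical signals: a sine-like waveform and a perturbation about 100x smaller in peak.
original = [0.5 * math.sin(0.1 * i) for i in range(300)]
perturbation = [0.005 * (1 if i % 2 == 0 else -1) for i in range(300)]

distortion = relative_distortion_db(original, perturbation)
print(f"relative distortion: {distortion:.2f} dB")
```

A perturbation whose peak is one hundredth of the signal's peak comes out at roughly -40 dB on this scale.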
