There is more than one kind of robustness: Fooling Whisper with adversarial examples

Raphael Olivier; Bhiksha Raj

arXiv:2210.17316·eess.AS·August 14, 2023

TL;DR

This paper demonstrates that despite Whisper's robustness to noise and out-of-distribution inputs, it is vulnerable to adversarial examples that can significantly degrade performance or alter transcriptions, highlighting security concerns.

Contribution

The study reveals that Whisper's robustness does not extend to adversarial noise and introduces methods to generate such examples, exposing critical security vulnerabilities.

Findings

01

Adversarial noise can drastically reduce Whisper's accuracy.

02

Targeted transcriptions can be achieved with small perturbations.

03

Multilingual models' performance can be compromised by fooling language detectors.

Abstract

Whisper is a recent Automatic Speech Recognition (ASR) model displaying impressive robustness to both out-of-distribution inputs and random noise. In this work, we show that this robustness does not carry over to adversarial noise. We show that we can degrade Whisper's performance dramatically, or even make it transcribe a target sentence of our choice, by generating very small input perturbations with a signal-to-noise ratio (SNR) of 35-45 dB. We also show that by fooling the Whisper language detector we can very easily degrade the performance of multilingual models. These vulnerabilities of a widely popular open-source model have practical security implications and emphasize the need for adversarially robust ASR.
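To give a sense of how small a 35-45 dB perturbation is, the sketch below computes SNR in decibels and rescales a noise vector to hit a chosen SNR. It assumes the common power-ratio definition, SNR = 10 * log10(P_signal / P_noise); the paper's exact convention may differ. The helper names (`power`, `snr_db`, `scale_to_snr`) are hypothetical, and the noise here is random rather than adversarial, so this illustrates only the size of the perturbation budget, not the attack itself.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """SNR in decibels, assuming the power-ratio definition."""
    return 10.0 * math.log10(power(signal) / power(noise))


def scale_to_snr(signal, noise, target_db):
    """Rescale `noise` so that snr_db(signal, result) equals target_db."""
    target_noise_power = power(signal) / (10.0 ** (target_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [gain * n for n in noise]


# Toy stand-in for one second of 16 kHz audio: a 440 Hz sine wave.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]

# Random (NOT adversarial) noise, rescaled to a 40 dB SNR budget.
noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]
delta = scale_to_snr(signal, noise, 40.0)
perturbed = [s + d for s, d in zip(signal, delta)]

# RMS amplitude of the perturbation relative to the signal.
rms_ratio = math.sqrt(power(delta) / power(signal))
print(f"SNR: {snr_db(signal, delta):.2f} dB, RMS ratio: {rms_ratio:.4f}")
```

Under this definition, a 40 dB SNR corresponds to a perturbation whose RMS amplitude is 1% of the signal's, and 35-45 dB spans roughly 1.8% down to 0.56%.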

Code & Models

Repositories

raphaelolivier/whisper_attack
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Speech Recognition and Synthesis