Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models

Vyas Raina; Rao Ma; Charles McGhee; Kate Knill; Mark Gales

arXiv:2405.06134·cs.CL·July 18, 2024

TL;DR

This paper reveals a vulnerability in Whisper speech models: a universal adversarial audio segment, prepended to the input, can mute the model's output. The result highlights security risks as well as potential privacy-protecting uses.

Contribution

The authors introduce a universal adversarial audio method that effectively 'mutes' Whisper models, demonstrating a new type of acoustic attack on speech foundation models.

Findings

01

A single 0.64-second adversarial segment, prepended to the input, mutes a target Whisper model on over 97% of speech samples.

02

The adversarial segment transfers across datasets and tasks.

03

The attack poses both security risks and potential privacy benefits.

Abstract

Recent developments in large speech foundation models like Whisper have led to their widespread use in many automatic speech recognition (ASR) applications. These systems incorporate 'special tokens' in their vocabulary, such as <|endoftext|>, to guide their language generation process. However, we demonstrate that these tokens can be exploited by adversarial attacks to manipulate the model's behavior. We propose a simple yet effective method to learn a universal acoustic realization of Whisper's <|endoftext|> token, which, when prepended to any speech signal, encourages the model to ignore the speech and only transcribe the special token, effectively 'muting' the model. Our experiments demonstrate that the same, universal 0.64-second adversarial audio segment can successfully mute a target Whisper ASR model for over 97% of speech samples. Moreover, we find that…
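The prepend-and-optimize idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_eot_prob` is a hypothetical logistic stand-in for Whisper's probability of emitting <|endoftext|> first, and the amplitude budget `eps`, learning rate `lr`, and step count are illustrative values, not figures from the paper. Only the 0.64-second duration comes from the abstract; the 16 kHz sample rate is Whisper's standard input rate. The sketch assumes the segment is learned by projected gradient ascent over a set of utterances, which the abstract does not spell out.

```python
import math
import random

SAMPLE_RATE = 16_000     # Whisper operates on 16 kHz audio
SEGMENT_SECONDS = 0.64   # segment duration reported in the abstract
SEGMENT_SAMPLES = round(SAMPLE_RATE * SEGMENT_SECONDS)  # 10,240 samples


def prepend(segment, speech):
    """Return the attacked waveform: universal segment followed by the speech."""
    return list(segment) + list(speech)


def toy_eot_prob(audio, weights, bias=-4.0):
    """Hypothetical stand-in for P(<|endoftext|> is the first token | audio).

    A real attack would query Whisper here; this is only a logistic toy.
    """
    score = bias + sum(w * x for w, x in zip(weights, audio))
    return 1.0 / (1.0 + math.exp(-score))


def learn_segment(utterances, weights, eps=0.02, lr=0.01, steps=5):
    """Projected gradient ascent on the mean log P(eot) over all utterances."""
    segment = [0.0] * SEGMENT_SAMPLES
    for _ in range(steps):
        # For the toy model, d/d(segment[i]) log p = (1 - p) * weights[i].
        scale = sum(
            1.0 - toy_eot_prob(prepend(segment, u), weights) for u in utterances
        ) / len(utterances)
        # Gradient step, then clip back into the amplitude budget [-eps, eps].
        segment = [
            max(-eps, min(eps, s + lr * scale * w))
            for s, w in zip(segment, weights)
        ]
    return segment


if __name__ == "__main__":
    random.seed(0)
    weights = [random.uniform(-1, 1) for _ in range(SEGMENT_SAMPLES + 100)]
    utterances = [[random.uniform(-0.1, 0.1) for _ in range(100)] for _ in range(4)]
    segment = learn_segment(utterances, weights)
    for u in utterances:
        print(round(toy_eot_prob(prepend(segment, u), weights), 4))
```

The key design point is that one segment is shared across every utterance in the loop, which is what makes the result universal rather than per-sample. Replacing the toy scorer with a differentiable speech model would leave the structure of the loop unchanged.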

Code & Models

Repositories

rainavyas/prepend_acoustic_attack
PyTorch (official)

Videos

Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models

Taxonomy

Topics: Adversarial Robustness in Machine Learning