ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features

Peng Cheng; Yuwei Wang; Peng Huang; Zhongjie Ba; Xiaodong Lin; Feng Lin; Li Lu; Kui Ren

arXiv:2408.01808·cs.CR·August 6, 2024

TL;DR

This paper introduces ALIF, a novel low-cost black-box adversarial attack method on speech recognition systems that uses linguistic features and leverages the reciprocal TTS-ASR process to generate effective adversarial audio with minimal queries.

Contribution

ALIF is the first attack pipeline that constructs adversarial examples directly around the decision boundary using linguistic features, significantly reducing query costs and increasing robustness against ASR updates.

Findings

01. ALIF-OTL reduces query count by 97.7%.

02. ALIF-OTA reduces query count by 73.3%.

03. ALIF can generate attacks with only one query.

Abstract

Extensive research has revealed that adversarial examples (AE) pose a significant threat to voice-controllable smart devices. Recent studies have proposed black-box adversarial attacks that require only the final transcription from an automatic speech recognition (ASR) system. However, these attacks typically involve many queries to the ASR, resulting in substantial costs. Moreover, AE-based adversarial audio samples are susceptible to ASR updates. In this paper, we identify the root cause of these limitations, namely the inability to construct AE attack samples directly around the decision boundary of deep learning (DL) models. Building on this observation, we propose ALIF, the first black-box adversarial linguistic feature-based attack pipeline. We leverage the reciprocal process of text-to-speech (TTS) and ASR models to generate perturbations in the linguistic embedding space where…

Code & Models

Repositories

TASER2023/TASER
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning

Methods: Autoencoders