ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features
Peng Cheng, Yuwei Wang, Peng Huang, Zhongjie Ba, Xiaodong Lin, Feng Lin, Li Lu, Kui Ren

TL;DR
This paper introduces ALIF, a novel low-cost black-box adversarial attack method on speech recognition systems that uses linguistic features and leverages the reciprocal TTS-ASR process to generate effective adversarial audio with minimal queries.
Contribution
ALIF is the first attack pipeline that constructs adversarial examples directly around the decision boundary using linguistic features, significantly reducing query costs and increasing robustness against ASR updates.
Findings
ALIF-OTL (the over-the-line variant) reduces query count by 97.7%.
ALIF-OTA (the over-the-air variant) reduces query count by 73.3%.
ALIF can generate attacks with only one query.
Abstract
Extensive research has revealed that adversarial examples (AE) pose a significant threat to voice-controllable smart devices. Recent studies have proposed black-box adversarial attacks that require only the final transcription from an automatic speech recognition (ASR) system. However, these attacks typically involve many queries to the ASR, resulting in substantial costs. Moreover, AE-based adversarial audio samples are susceptible to ASR updates. In this paper, we identify the root cause of these limitations, namely the inability to construct AE attack samples directly around the decision boundary of deep learning (DL) models. Building on this observation, we propose ALIF, the first black-box adversarial linguistic feature-based attack pipeline. We leverage the reciprocal process of text-to-speech (TTS) and ASR models to generate perturbations in the linguistic embedding space where…
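The idea of perturbing a linguistic embedding, synthesizing audio from it, and checking only the final transcription of a black-box ASR can be illustrated with a toy sketch. This is not the paper's algorithm: `toy_tts`, `toy_asr`, `perturb_in_embedding_space`, the random linear "synthesizer", and the largest-scale-first search are all hypothetical stand-ins chosen so the example runs on its own.

```python
import numpy as np

# Hypothetical stand-ins: a fixed random linear map plays the role of a TTS
# decoder, and a dot-product test plays the role of a black-box ASR.
_W = np.random.default_rng(0).standard_normal((16, 64))
BASE_EMBEDDING = np.full(16, 0.1)      # toy "linguistic embedding" of the command
TARGET_TEXT = "open the door"          # toy target transcription


def toy_tts(embedding):
    """Map a linguistic embedding to a toy 'waveform' (assumption, not a real TTS)."""
    return np.tanh(embedding @ _W)


_PROBE = toy_tts(BASE_EMBEDDING)       # the clean audio defines what the toy ASR accepts


def toy_asr(waveform):
    """Black-box toy ASR: exposes only the final transcription, no scores."""
    return TARGET_TEXT if float(waveform @ _PROBE) > 0 else "<unk>"


def perturb_in_embedding_space(base, target, scales=(2.0, 1.0, 0.5, 0.25), seed=1):
    """Try perturbations from largest to smallest; stop at the first one that
    the ASR still transcribes as `target`. Returns (embedding, queries_used)."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(base.shape)
    queries = 0
    for scale in scales:
        candidate = base + scale * direction
        queries += 1                   # each ASR call counts as one query
        if toy_asr(toy_tts(candidate)) == target:
            return candidate, queries
    return base, queries               # fall back to the unperturbed embedding


if __name__ == "__main__":
    adv, used = perturb_in_embedding_space(BASE_EMBEDDING, TARGET_TEXT)
    print("queries used:", used)
    print("transcription:", toy_asr(toy_tts(adv)))
```

Because the search operates on the embedding before synthesis, each candidate costs one ASR query; the query counter in the sketch is the quantity that the findings above report reductions in.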
Taxonomy
Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning
Methods: Autoencoders
