Compensating Removed Frequency Components: Thwarting Voice Spectrum Reduction Attacks
Shu Wang, Kun Sun, Qi Li

TL;DR
This paper introduces ACE, an acoustic compensation system that counters spectrum reduction attacks on automatic speech recognition (ASR) systems by exploiting frequency component dependencies and over-the-air perturbations, reducing the ASR errors caused by these attacks by up to 87.9%.
Contribution
The paper proposes a novel acoustic compensation method that mitigates spectrum reduction attacks on ASR by exploiting frequency dependencies and modeling acoustic propagation effects.
Findings
ACE reduces up to 87.9% of ASR errors caused by spectrum reduction attacks.
The system leverages frequency component dependencies for effective compensation.
Analysis identifies six error types and potential mitigation strategies.
Abstract
Automatic speech recognition (ASR) provides diverse audio-to-text services for humans to communicate with machines. However, recent research reveals ASR systems are vulnerable to various malicious audio attacks. In particular, by removing the non-essential frequency components, a new spectrum reduction attack can generate adversarial audios that can be perceived by humans but cannot be correctly interpreted by ASR systems. It raises a new challenge for content moderation solutions to detect harmful content in audio and video available on social media platforms. In this paper, we propose an acoustic compensation system named ACE to counter the spectrum reduction attacks over ASR systems. Our system design is based on two observations, namely, frequency component dependencies and perturbation sensitivity. First, since the Discrete Fourier Transform computation inevitably introduces…
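To make the attack described in the abstract concrete, here is a minimal sketch of spectrum reduction on a toy signal. It is an illustration under stated assumptions, not the attack implementation evaluated in the paper: the function name `spectrum_reduction`, the `keep_ratio` parameter, and the rule "zero every DFT bin whose magnitude falls below a fraction of the peak magnitude" are all hypothetical choices made for this example. A naive pure-Python DFT is used so the block has no dependencies.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) Discrete Fourier Transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def spectrum_reduction(signal, keep_ratio):
    """Hypothetical attack sketch: drop weak frequency components.

    Every DFT bin whose magnitude is below keep_ratio * (peak magnitude)
    is set to zero, then the signal is rebuilt from the remaining bins.
    Returns (reduced_signal, number_of_zeroed_bins).
    """
    spectrum = dft(signal)
    threshold = keep_ratio * max(abs(c) for c in spectrum)
    reduced = [c if abs(c) >= threshold else 0 for c in spectrum]
    removed = sum(1 for c in reduced if c == 0)
    return idft(reduced), removed


if __name__ == "__main__":
    n = 64
    # Toy "speech": one dominant tone plus one weak tone (assumed values).
    strong = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    weak = [0.05 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
    mixed = [s + w for s, w in zip(strong, weak)]

    attacked, removed = spectrum_reduction(mixed, keep_ratio=0.1)
    print("zeroed bins:", removed)
    print("max deviation from dominant tone:",
          max(abs(a - s) for a, s in zip(attacked, strong)))
```

In this toy case the weak tone is discarded and only the dominant tone survives, which mirrors the idea that the removed components are non-essential to a human listener while still altering the input an ASR model sees. A compensation system such as ACE would aim to restore the removed components; that step is not sketched here.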
