Evaluating Synthetic Command Attacks on Smart Voice Assistants
Zhengxian He, Ashish Kundu, Mustaque Ahamad

TL;DR
This paper investigates the security risks that synthetic speech commands pose to voice assistants. It demonstrates that simple synthesis methods can issue malicious commands with minimal resources, which highlights the need for improved defenses.
Contribution
It demonstrates the feasibility of synthetic command attacks on voice assistants and evaluates their effectiveness using simple concatenative speech synthesis techniques.
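As a rough illustration of what "concatenative" means here, the sketch below joins pre-cut word-level sample sequences and applies a short linear crossfade at each seam. It is a minimal sketch, not the authors' pipeline: the names `crossfade_concat`, `synthesize`, and `unit_bank`, and the use of plain Python lists for audio samples, are all assumptions made for illustration.

```python
def crossfade_concat(units, overlap):
    """Join sample lists, linearly crossfading `overlap` samples at each seam."""
    out = []
    for unit in units:
        unit = list(unit)
        # The fade cannot be longer than either side of the seam.
        n = min(overlap, len(out), len(unit))
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit
            j = len(out) - n + i   # index of the overlapping tail sample
            out[j] = (1 - w) * out[j] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(words, unit_bank, overlap=2):
    """Look up each word in a hypothetical bank of recorded units and join them."""
    missing = [w for w in words if w not in unit_bank]
    if missing:
        raise KeyError(f"no recorded unit for: {missing}")
    return crossfade_concat([unit_bank[w] for w in words], overlap)


if __name__ == "__main__":
    # Toy "recordings": constant-valued clips standing in for real audio.
    bank = {"open": [1.0] * 4, "door": [0.0] * 4}
    print(synthesize(["open", "door"], bank, overlap=2))
```

The crossfade is one common way to soften the discontinuity where two units meet; the summary does not say how, or whether, the paper smooths its joins.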
Findings
Synthetic commands can successfully trigger voice assistants to perform sensitive operations.
Attacks require minimal host and network resources, making them practical.
Current verification methods are vulnerable to synthetic speech-based impersonation.
Abstract
Recent advances in voice synthesis, coupled with the ease with which speech can be harvested for millions of people, introduce new threats to applications that are enabled by devices such as voice assistants (e.g., Amazon Alexa, Google Home, etc.). We explore if an unrelated and limited amount of speech from a target can be used to synthesize commands for a voice assistant like Amazon Alexa. More specifically, we investigate attacks on voice assistants with synthetic commands when they match command sources to authorized users, and applications (e.g., Alexa Skills) process commands only when their source is an authorized user with a chosen confidence level. We demonstrate that even simple concatenative speech synthesis can be used by an attacker to command voice assistants to perform sensitive operations. We also show that such attacks, when launched by exploiting compromised devices in the…
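The gating rule the abstract describes, where an application acts on a command only if its source matches an authorized user at a chosen confidence level, can be sketched as a simple threshold check. This is a minimal sketch under stated assumptions: `SpeakerMatch`, `should_process`, and the 0 to 1 confidence scale are hypothetical and do not reflect any real voice-assistant API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerMatch:
    """Hypothetical result of matching a command's audio to a known user."""
    speaker_id: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def should_process(match, authorized, threshold):
    """Accept a command only from an authorized speaker at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0.0, 1.0]")
    return match.speaker_id in authorized and match.confidence >= threshold


if __name__ == "__main__":
    authorized = {"alice"}
    print(should_process(SpeakerMatch("alice", 0.92), authorized, 0.8))
    print(should_process(SpeakerMatch("alice", 0.55), authorized, 0.8))
    print(should_process(SpeakerMatch("mallory", 0.99), authorized, 0.8))
```

Under this model, a synthetic command succeeds whenever the matcher assigns it the target's identity with a score at or above the threshold, which is why the choice of confidence level matters.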
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
