Securing Voice-driven Interfaces against Fake (Cloned) Audio Attacks

Hafiz Malik

arXiv:1902.06782·eess.AS·February 20, 2019

Open Access

TL;DR

This paper presents a method using higher-order spectral analysis to detect fake cloned speech, addressing security threats posed by advanced voice synthesis technologies.

Contribution

It introduces a novel detection approach leveraging spectral artifacts to distinguish genuine speech from cloned audio, improving security in voice interfaces.

Findings

01 Near-perfect detection rate of cloned speech

02 Effective on multiple cloning approaches

03 Robust against different synthesis artifacts

Abstract

Voice cloning technologies have found applications in a variety of areas, ranging from personalized speech interfaces to advertisement and robotics. Existing voice cloning systems can learn speaker characteristics and use trained models to synthesize a person's voice from only a few audio samples. Advances in cloned speech generation now make it possible to produce speech that is perceptually indistinguishable from bona fide speech. These advances pose new security and privacy threats to voice-driven interfaces and speech-based access control systems. State-of-the-art speech synthesis technologies use trained or tuned generative models for cloned speech generation. Trained generative models rely on linear operations, learned weights, and an excitation source for cloned speech synthesis. These systems leave characteristic artifacts in the synthesized speech.…
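
To give a concrete sense of what "higher-order spectral analysis" can mean in practice, here is a minimal sketch of a direct, FFT-based bispectrum estimate in Python with NumPy. The bispectrum (the third-order spectrum) is one common higher-order statistic; this summary does not say which quantities or features the paper actually uses, so treat the choice of the bispectrum as an assumption. The function name `bispectrum` and the parameters `nfft` and `hop` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bispectrum(x, nfft=64, hop=32):
    """Direct bispectrum estimate, averaged over windowed frames.

    Illustrative sketch only; not the detector described in the paper.
    Returns an (nfft, nfft) complex array B where
    B[k1, k2] = mean over frames of X[k1] * X[k2] * conj(X[k1 + k2]).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(nfft)

    # Index grid for the sum frequency (k1 + k2), wrapped modulo nfft.
    k = np.arange(nfft)
    sum_idx = (k[:, None] + k[None, :]) % nfft

    acc = np.zeros((nfft, nfft), dtype=complex)
    count = 0
    for start in range(0, len(x) - nfft + 1, hop):
        frame = x[start:start + nfft]
        # Remove the frame mean, then taper to reduce spectral leakage.
        frame = (frame - frame.mean()) * window
        X = np.fft.fft(frame)
        # Triple product; stays large only where phases are coupled.
        acc += X[:, None] * X[None, :] * np.conj(X[sum_idx])
        count += 1

    if count == 0:
        raise ValueError("signal is shorter than one frame")
    return acc / count
```

The magnitude `np.abs(bispectrum(signal))` is large at frequency pairs whose phases are coupled, which is the kind of nonlinear structure that second-order tools such as the power spectrum cannot see. A detector along these lines would compare such statistics between bona fide and cloned recordings, although the specific features and classifier are not given in this summary.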

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing