Attacking Speaker Recognition With Deep Generative Models

Wilson Cai; Anish Doshi; Rafael Valle

arXiv:1801.02384 · cs.SD · January 9, 2018 · 20 cites

Open Access

TL;DR

This paper explores the use of deep generative models, specifically GANs, to create spoofing attacks on speaker recognition systems, highlighting security vulnerabilities and proposing a semi-supervised attack method.

Contribution

It introduces a modified Wasserstein GAN for semi-supervised attack generation, capable of both targeted and untargeted spoofing on speaker recognition systems.

Findings

01. Samples from SampleRNN and WaveNet do not fool CNN-based systems

02. Modified Wasserstein GAN enables effective spoofing attacks

03. Raises security concerns in speaker authentication systems

Abstract

In this paper we investigate the ability of generative adversarial networks (GANs) to synthesize spoofing attacks on modern speaker recognition systems. We first show that samples generated with SampleRNN and WaveNet are unable to fool a CNN-based speaker recognition system. We propose a modification of the Wasserstein GAN objective function to make use of data that is real but not from the class being learned. Our semi-supervised learning method is able to perform both targeted and untargeted attacks, raising questions related to security in speaker authentication systems.
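
The abstract does not state the modified objective itself, so the following is only an illustrative sketch of one way a Wasserstein GAN critic loss could use data that is real but not from the class being learned. It assumes, hypothetically, that other-speaker samples are mixed in as extra negatives alongside generated samples; the function names and the `other_weight` parameter are inventions for this sketch, not the paper's notation.

```python
from statistics import fmean


def critic_loss(scores_target, scores_generated, scores_other, other_weight=0.5):
    """Loss minimized by the critic (hypothetical mixed-negative form).

    scores_target:    critic outputs on real samples of the target speaker
    scores_generated: critic outputs on generator samples
    scores_other:     critic outputs on real samples from other speakers
    other_weight:     assumed mixing weight in [0, 1]; 0 recovers the
                      standard Wasserstein GAN critic loss
    """
    if not 0.0 <= other_weight <= 1.0:
        raise ValueError("other_weight must lie in [0, 1]")
    # Negatives are a weighted mix of generated and other-speaker scores.
    negatives = ((1.0 - other_weight) * fmean(scores_generated)
                 + other_weight * fmean(scores_other))
    # The critic wants high scores on the target class, low on negatives.
    return negatives - fmean(scores_target)


def generator_loss(scores_generated):
    """Standard Wasserstein GAN generator loss: raise the critic's score."""
    return -fmean(scores_generated)
```

With `other_weight=0.0` the critic loss reduces to the usual Wasserstein form, which makes the assumed extra term easy to isolate when comparing against a baseline.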

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Speech Recognition and Synthesis

Methods: Mixture of Logistic Distributions · Convolution · Dilated Causal Convolution · WaveNet