Fooling End-to-end Speaker Verification by Adversarial Examples

Felix Kreuk; Yossi Adi; Moustapha Cisse; Joseph Keshet

arXiv:1801.03339 · cs.LG · February 19, 2018

TL;DR

This paper demonstrates that end-to-end speaker verification systems are vulnerable to adversarial examples: perturbed recordings that are almost indistinguishable from the originals to a human listener can fool the system into attributing them to a different speaker.

Contribution

The paper presents both white-box and black-box adversarial attacks on end-to-end speaker verification models, highlighting their susceptibility to such attacks.

Findings

01. Adversarial examples significantly reduce system accuracy.

02. False-positive rates increase dramatically under attack.

03. White-box attacks are highly effective across datasets.

Abstract

Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding a peculiar noise to original speaker examples, in such a way that they are almost indistinguishable from the original examples by a human listener. Yet, the generated waveforms, which sound like speaker A, can be used to fool such a system by claiming that the waveforms were uttered by speaker B. We present white-box attacks on an end-to-end deep network that was trained on either YOHO or NTIMIT. We also present two black-box attacks: where the adversarial examples were generated with a system that was trained on YOHO, but the attack is on a system…
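
The idea of adding a small, targeted noise to an input so that a verifier accepts the wrong speaker can be illustrated with a toy sketch. The abstract does not say how the noise is computed, so the code below assumes a one-step sign-gradient perturbation (in the style of the fast gradient sign method) applied to a hypothetical logistic verifier over a feature vector. The names `verify_score` and `sign_gradient_perturbation`, the 64-dimensional features, and the value of `epsilon` are all illustrative assumptions; this is not the paper's end-to-end network or its exact procedure.

```python
import numpy as np

def verify_score(w, b, x):
    """Toy verifier (assumption): probability that features x match the claimed speaker."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def sign_gradient_perturbation(w, b, x, target, epsilon):
    """One-step sign-gradient perturbation that pushes the score toward `target` (0 or 1).

    For a logistic model with cross-entropy loss, dL/dx = (p - target) * w.
    Each feature moves by at most `epsilon`, which keeps the change small.
    """
    p = verify_score(w, b, x)
    grad = (p - target) * w
    # Step against the loss gradient so the loss for the target label decreases.
    return x - epsilon * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=64)          # hypothetical verifier weights
b = 0.0
x = 0.1 * rng.normal(size=64)    # hypothetical features of an utterance by speaker A

# Shift x along w so the verifier initially rejects the claim "this is speaker B"
# (logit fixed at -1, i.e. a score of roughly 0.27).
x = x - ((w @ x + 1.0) / (w @ w)) * w

epsilon = 0.05
x_adv = sign_gradient_perturbation(w, b, x, target=1.0, epsilon=epsilon)

print("score before:", verify_score(w, b, x))
print("score after: ", verify_score(w, b, x_adv))
print("largest per-feature change:", np.max(np.abs(x_adv - x)))
```

In this toy setting, a change bounded by `epsilon` per feature is enough to move the score across the 0.5 decision threshold. A white-box attack in the sense of the abstract would compute the gradient from the attacked model itself, whereas a black-box attack would compute it from a different model and transfer the result.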
