Representation Learning to Classify and Detect Adversarial Attacks against Speaker and Speech Recognition Systems

Jesús Villalba, Sonal Joshi, Piotr Żelasko, and Najim Dehak

arXiv:2107.04448 · eess.AS · July 12, 2021 · Interspeech

TL;DR

This paper uses representation learning to classify and detect adversarial attacks against speech and speaker recognition systems, reaching up to 90% classification accuracy on known attacks and an equal error rate of about 19% when detecting unknown attacks.

Contribution

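The paper introduces a representation-learning method that classifies and detects adversarial attacks on audio systems by attack algorithm, threat model, or signal-to-adversarial-noise ratio, and it examines how well the learned representations generalize to other tasks and to unknown attacks.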

Findings

01

Achieved up to 90% accuracy in classifying common attacks

02

Representations trained to classify attacks against speaker identification also classify attacks against speaker verification and speech recognition

03

Detected unknown attacks with about 19% equal error rate
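
For readers unfamiliar with the metric in the last finding: the equal error rate (EER) is the operating point at which the miss rate equals the false-alarm rate. The following is a minimal sketch of how an EER can be approximated from two lists of detection scores. The function name `equal_error_rate` and the toy scores are illustrative assumptions, not taken from the paper, and the paper's own scoring tools may compute the EER differently (for example, by interpolating the ROC curve).

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration; higher scores are assumed to
    mean "more likely a target trial".
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above it.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    i = np.argmin(np.abs(miss - false_alarm))
    return float((miss[i] + false_alarm[i]) / 2)

# Toy scores (made up): well-separated trials give an EER of zero.
eer_separated = equal_error_rate([0.9, 0.8], [0.1, 0.2])
# Identical score distributions give chance-level performance.
eer_chance = equal_error_rate([1, 2, 3, 4], [1, 2, 3, 4])
```

Under this reading, an EER of about 19% means that at the balanced threshold roughly one in five trials of each kind is misclassified.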

Abstract

Adversarial attacks have become a major threat for machine learning applications. There is a growing interest in studying these attacks in the audio domain, e.g., speech and speaker recognition, and in finding defenses against them. In this work, we focus on using representation learning to classify/detect attacks w.r.t. the attack algorithm, threat model, or signal-to-adversarial-noise ratio. We found that common attacks in the literature can be classified with accuracies as high as 90%. Representations trained to classify attacks against speaker identification can also be used to classify attacks against speaker verification and speech recognition. We also tested an attack verification task, where we need to decide whether two speech utterances contain the same attack. We observed that our models did not generalize well to attack algorithms not included in the attack representation…
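
The abstract mentions the signal-to-adversarial-noise ratio as one of the attributes used to categorize attacks. As a rough illustration, the sketch below computes a signal-to-noise ratio in decibels where the "noise" is the adversarial perturbation, i.e. the difference between the adversarial and the clean waveform. This follows the standard SNR definition; the function name and the toy waveform are assumptions made here for illustration, and the paper's exact definition may differ in detail.

```python
import numpy as np

def signal_to_adversarial_noise_ratio_db(clean, adversarial):
    """SNR in dB, treating (adversarial - clean) as the noise.

    Hypothetical helper: standard 10*log10(signal power / noise power).
    """
    clean = np.asarray(clean, dtype=float)
    adversarial = np.asarray(adversarial, dtype=float)
    perturbation = adversarial - clean
    signal_power = np.sum(clean ** 2)
    noise_power = np.sum(perturbation ** 2)
    return float(10.0 * np.log10(signal_power / noise_power))

# Toy example (made-up samples): a perturbation with one tenth of the
# signal amplitude has one hundredth of its power, i.e. about 20 dB.
clean = np.ones(4)
adversarial = clean + 0.1
snr_db = signal_to_adversarial_noise_ratio_db(clean, adversarial)
```

A higher value means a weaker perturbation relative to the speech signal, which generally makes the attack harder to perceive.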
