Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh, Elvir Karimov, Mikhail Pautov, Oleg Y. Rogov, Ivan Oseledets

TL;DR
This paper adapts robustness certification techniques from image recognition to speaker recognition, enhancing model reliability against additive adversarial perturbations and advancing security in voice biometric systems.
Contribution
It introduces the application and improvement of randomized smoothing certification methods for speaker recognition, filling a gap in robustness certification in the audio domain.
Findings
Certification methods shown to be effective on the VoxCeleb 1 and 2 datasets for several models
Certified robustness against norm-bounded additive perturbations
Expected to accelerate research on certification methods in the audio domain
Abstract
Speaker recognition technology is applied to various tasks, from personal virtual assistants to secure access systems. However, the robustness of these systems against adversarial attacks, particularly additive perturbations, remains a significant challenge. In this paper, we pioneer the application to speaker recognition of robustness certification techniques originally developed for the image domain. We fill this gap by transferring randomized smoothing certification techniques against norm-bounded additive perturbations, for both classification and few-shot learning tasks, to speaker recognition and improving them. We demonstrate the effectiveness of these methods on the VoxCeleb 1 and 2 datasets for several models. We expect this work to improve the robustness of voice biometrics and accelerate research on certification methods in the audio domain.
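To make the idea of randomized smoothing certification concrete, the following is a minimal sketch of the standard classification variant known from the image domain, not the paper's own improved procedure. It classifies many Gaussian-noised copies of an input, lower-bounds the probability of the top class, and converts that bound into a certified L2 radius of sigma times the inverse normal CDF of the bound. The names `certify_l2` and `toy_classifier`, the Hoeffding confidence bound, and the toy "waveform" are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def certify_l2(base_classifier, x, sigma, n0=100, n=1000, alpha=0.001, seed=0):
    """Return (label, certified_l2_radius), or (None, 0.0) to abstain.

    Hypothetical helper: `base_classifier` maps a list of floats to a label.
    """
    rng = random.Random(seed)

    def sample_counts(num):
        # Classify `num` copies of x perturbed by i.i.d. Gaussian noise.
        counts = {}
        for _ in range(num):
            noisy = [v + rng.gauss(0.0, sigma) for v in x]
            label = base_classifier(noisy)
            counts[label] = counts.get(label, 0) + 1
        return counts

    # Stage 1: guess the top class from a small, separate batch of samples.
    selection = sample_counts(n0)
    top_label = max(selection, key=selection.get)

    # Stage 2: estimate its probability on fresh samples.
    top_count = sample_counts(n).get(top_label, 0)

    # Hoeffding lower bound, valid with probability at least 1 - alpha
    # (a simple stand-in for a tighter binomial bound; an assumption here).
    p_lower = top_count / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # not confident enough to certify

    # Certified L2 radius of the smoothed classifier around x.
    return top_label, sigma * NormalDist().inv_cdf(p_lower)


def toy_classifier(signal):
    # Toy stand-in for a speaker model: decides by the sign of the mean.
    return "speaker_a" if sum(signal) / len(signal) > 0 else "speaker_b"


label, radius = certify_l2(toy_classifier, [1.0] * 16, sigma=0.5)
abstain_label, abstain_radius = certify_l2(toy_classifier, [0.0] * 16, sigma=0.5)
```

In this sketch a larger `sigma` can yield larger certified radii but makes the base classifier's job harder under noise, which is the usual trade-off in randomized smoothing. An input near the decision boundary, such as the all-zero signal above, should lead to abstention rather than a certificate.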
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
Methods: Randomized Smoothing
