SyntheticPop: Attacking Speaker Verification Systems With Synthetic VoicePops

Eshaq Jamdar; Amith Kamath Belman

arXiv:2502.09553·cs.CR·February 14, 2025

TL;DR

This paper introduces SyntheticPop, an attack that embeds synthetic noise into spoofed audio to undermine voice authentication systems augmented with VoicePop, showing that these systems are vulnerable to such adversarial manipulation.

Contribution

We propose SyntheticPop, an effective attack method that degrades VA+VoicePop performance by embedding synthetic noises, highlighting the need for more robust defenses against such attacks.

Findings

01 SyntheticPop achieves an attack success rate of over 95%.

02 VA+VoicePop accuracy drops to 14% under the SyntheticPop attack.

03 A baseline label-flipping attack reduces VA+VoicePop accuracy to 37%.
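
For readers unfamiliar with the label-flipping baseline named in the findings, the following is a minimal sketch of the general technique: a fraction of the training labels is inverted before the model is trained. It is not the authors' implementation. The function names `flip_labels` and `accuracy`, the binary 0/1 label encoding, and the 20% fraction in the usage lines are all assumptions chosen for illustration.

```python
import random


def flip_labels(labels, fraction, seed=0):
    """Return a copy of binary (0/1) labels with `fraction` of them inverted.

    Hypothetical helper: a generic label-flipping poisoning step,
    not the procedure used in the paper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)  # seeded so the poisoned set is reproducible
    n_flip = int(round(fraction * len(labels)))
    # Pick distinct positions to poison, then invert only those labels.
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def accuracy(predicted, actual):
    """Fraction of positions where the two label sequences agree."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Usage with an illustrative (assumed) poisoning fraction of 20%.
clean = [0, 1] * 50
poisoned = flip_labels(clean, fraction=0.2)
agreement = accuracy(poisoned, clean)  # share of labels left untouched
```

Training on `poisoned` instead of `clean` is what degrades the resulting classifier; the sketch covers only the data-poisoning step, not training or evaluation.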

Abstract

Voice Authentication (VA), also known as Automatic Speaker Verification (ASV), is a widely adopted authentication method, particularly in automated systems like banking services, where it serves as a secondary layer of user authentication. Despite its popularity, VA systems are vulnerable to various attacks, including replay, impersonation, and the emerging threat of deepfake audio that mimics the voice of legitimate users. To mitigate these risks, several defense mechanisms have been proposed. One such solution, Voice Pops, aims to distinguish an individual's unique phoneme pronunciations during the enrollment process. While promising, the effectiveness of VA+VoicePop against a broader range of attacks, particularly logical or adversarial attacks, remains insufficiently explored. We propose a novel attack method, which we refer to as SyntheticPop, designed to target the phoneme…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders