Perturbed Public Voices (P$^{2}$V): A Dataset for Robust Audio Deepfake Detection

Chongyang Gao; Marco Postiglione; Isabel Gortner; Sarit Kraus; V.S. Subrahmanian

arXiv:2508.10949·cs.SD·August 18, 2025

TL;DR

This paper introduces P$^{2}$V, a comprehensive dataset for evaluating and improving the robustness of audio deepfake detectors against real-world challenges like noise, adversarial attacks, and advanced voice cloning techniques.

Contribution

The paper presents P$^{2}$V, a novel dataset that captures realistic deepfake scenarios, and demonstrates its effectiveness as a benchmark for developing more robust audio deepfake detection models.

Findings

01

Detectors trained on existing datasets lose 43% performance when tested on P$^{2}$V

02

Simple adversarial perturbations cause up to 16% performance degradation

03

Models trained on P$^{2}$V maintain robustness and generalize well

Abstract

Current audio deepfake detectors cannot be trusted. While they excel on controlled benchmarks, they fail when tested in the real world. We introduce Perturbed Public Voices (P$^{2}$V), an IRB-approved dataset capturing three critical aspects of malicious deepfakes: (1) identity-consistent transcripts via LLMs, (2) environmental and adversarial noise, and (3) state-of-the-art voice cloning (2020-2025). Experiments reveal alarming vulnerabilities of 22 recent audio deepfake detectors: models trained on current datasets lose 43% performance when tested on P$^{2}$V, with performance measured as the mean of F1 score on deepfake audio, AUC, and 1-EER. Simple adversarial perturbations induce up to 16% performance degradation, while advanced cloning techniques reduce detectability by 20-30%. In contrast, P$^{2}$V-trained models maintain robustness against these attacks while generalizing to…
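The composite performance measure named in the abstract (the mean of F1 on deepfake audio, AUC, and 1-EER) can be illustrated with a minimal pure-Python sketch. Everything beyond that one-line definition is an assumption made for illustration: label 1 denotes deepfake, higher scores mean "more likely deepfake", F1 uses a fixed 0.5 decision threshold, and EER is found by linear interpolation along the ROC curve. The function names are hypothetical and this is not the authors' evaluation code.

```python
def f1_on_deepfake(labels, scores, threshold=0.5):
    """F1 with deepfake (label 1) as the positive class; threshold is an assumption."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_points(labels, scores):
    """ROC curve as (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one real and one deepfake sample")
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Samples sharing a score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def eer(points):
    """Equal error rate: the error level where FPR equals FNR (= 1 - TPR)."""
    prev = points[0]
    for cur in points[1:]:
        d_prev = prev[0] - (1 - prev[1])  # FPR - FNR, negative before the crossing
        d_cur = cur[0] - (1 - cur[1])
        if d_cur >= 0:
            # Interpolate linearly between the two ROC points around the crossing.
            t = -d_prev / (d_cur - d_prev)
            return prev[0] + t * (cur[0] - prev[0])
        prev = cur
    return 1.0  # not reached: the ROC curve always ends at (1, 1)


def composite_score(labels, scores, threshold=0.5):
    """Mean of F1 (deepfake class), AUC, and 1 - EER."""
    pts = roc_points(labels, scores)
    f1 = f1_on_deepfake(labels, scores, threshold)
    return (f1 + auc(pts) + (1 - eer(pts))) / 3


# Toy data (made up): two real clips (0) and two deepfakes (1).
labels = [0, 1, 0, 1]
scores = [0.1, 0.4, 0.6, 0.9]
print(round(composite_score(labels, scores), 4))
```

Averaging the three terms means a detector is rewarded only if it does well at a fixed operating point (F1), across all thresholds (AUC), and at the balanced-error point (1-EER) at once. A real evaluation would more likely rely on an established metrics library and the threshold convention of the paper itself.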
