A Comparative Study on Proactive and Passive Detection of Deepfake Speech

Chia-Hua Wu; Wanying Ge; Xin Wang; Junichi Yamagishi; Yu Tsao; Hsin-Min Wang

arXiv:2506.14398·cs.SD·June 18, 2025

TL;DR

This paper compares proactive watermarking and passive deepfake speech detection methods using a unified framework, evaluating their performance and robustness on common datasets and attacks.

Contribution

It introduces a standardized evaluation framework for both detection types, enabling fair comparison and analysis of their vulnerabilities.

Findings

01. Different models show distinct vulnerabilities to speech attribute distortions.

02. All models were trained and tested on common datasets for fair comparison.

03. The framework facilitates joint evaluation and selection of detection methods.

Abstract

Solutions for defending against deepfake speech fall into two categories: proactive watermarking models and passive conventional deepfake detectors. While both address common threats, their differences in training, optimization, and evaluation have prevented a unified protocol for evaluating them jointly and selecting the best solution for each use case. This work proposes a framework to evaluate both model types in deepfake speech detection. To ensure a fair comparison and minimize discrepancies, all models were trained and tested on common datasets, with performance evaluated using a shared metric. We also analyze their robustness against various adversarial attacks, showing that different models exhibit distinct vulnerabilities to different speech attribute distortions. Our training and evaluation code is available on GitHub.
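The abstract does not name the shared metric. As a minimal sketch of how one metric could score both model types, the example below assumes the equal error rate (EER), which is common in anti-spoofing work, and assumes every model emits one score per utterance where a higher value means "more likely bona fide". The function name `compute_eer` and the toy scores are hypothetical and are not taken from the authors' repository.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate from two score lists.

    Assumption: higher score = more likely bona fide. This is an
    illustrative metric choice, not confirmed by the source.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct observed score.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide utterances scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed utterances scored at or above it.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = int(np.argmin(np.abs(frr - far)))
    return (frr[idx] + far[idx]) / 2.0, thresholds[idx]

# Toy scores for two hypothetical systems evaluated on the same trials.
passive_eer, _ = compute_eer([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.7, 0.2])
proactive_eer, _ = compute_eer([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
print(f"passive EER: {passive_eer:.2f}, proactive EER: {proactive_eer:.2f}")
```

Because the function only needs scores and labels, the same call can be reused for a passive detector's output and for a watermark detector's output, provided both are mapped to the same score direction first.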

Code & Models

Repositories

nii-yamagishilab/antispoofing-watermark
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing