TwinShift: Benchmarking Audio Deepfake Detection across Synthesizer and Speaker Shifts

Jiyoung Hong; Yoonseo Chung; Seungyeon Oh; Juntae Kim; Jiyoung Lee; Sookyung Kim; Hyunsoo Cho

arXiv:2510.23096·cs.SD·October 28, 2025


TL;DR

TWINSHIFT is a new benchmark designed to evaluate the robustness of audio deepfake detectors when faced with unseen synthesis methods and speaker variations, highlighting current limitations and guiding future improvements.

Contribution

We introduce TWINSHIFT, a comprehensive benchmark for assessing audio deepfake detection robustness across unseen synthesizers and speakers, addressing a critical gap in current evaluation methods.

Findings

01 Current detectors struggle with unseen synthesis methods and speakers.

02 TWINSHIFT reveals significant robustness gaps in existing systems.

03 The benchmark provides guidance for developing more reliable detection methods.

Abstract

Audio deepfakes pose a growing threat, already exploited in fraud and misinformation. A key challenge is ensuring detectors remain robust to unseen synthesis methods and diverse speakers, since generation techniques evolve quickly. Despite strong benchmark results, current systems struggle to generalize to new conditions, which limits their real-world reliability. To address this, we introduce TWINSHIFT, a benchmark explicitly designed to evaluate detection robustness under strictly unseen conditions. Our benchmark is constructed from six different synthesis systems, each paired with disjoint sets of speakers, allowing for a rigorous assessment of how well detectors generalize when both the generative model and the speaker identity change. Through extensive experiments, we show that TWINSHIFT reveals important robustness gaps, uncover overlooked limitations, and provide principled guidance for…
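The "strictly unseen" condition described in the abstract can be illustrated with a small sketch. This is not the authors' construction code. The `Utterance` record, its field names, and the `strict_shift_split` helper are hypothetical, and tagging each clip (including bona fide ones) with the synthesizer subset it belongs to is an assumption made only for this example. The idea shown is that a test clip qualifies only if both its synthesizer and its speaker are absent from training.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Utterance(NamedTuple):
    # Hypothetical record layout, not taken from the paper.
    path: str
    synthesizer: str  # assumed: the synthesizer subset this clip is paired with
    speaker: str
    is_fake: bool


def strict_shift_split(
    utts: Iterable[Utterance],
    train_synths: Set[str],
    train_speakers: Set[str],
) -> Tuple[List[Utterance], List[Utterance]]:
    """Split clips so the test set shares no synthesizer and no speaker with training."""
    train, test = [], []
    for u in utts:
        seen_synth = u.synthesizer in train_synths
        seen_speaker = u.speaker in train_speakers
        if seen_synth and seen_speaker:
            train.append(u)
        elif not seen_synth and not seen_speaker:
            test.append(u)
        # Clips with only one factor unseen are dropped in this sketch,
        # so the test condition shifts both factors at once.
    return train, test


utts = [
    Utterance("a.wav", "synthA", "spk1", True),
    Utterance("b.wav", "synthA", "spk2", True),
    Utterance("c.wav", "synthB", "spk2", True),
    Utterance("d.wav", "synthB", "spk1", False),
]
train, test = strict_shift_split(utts, {"synthA"}, {"spk1"})
```

In this toy input, only `a.wav` lands in training and only `c.wav` in the test set. The two mixed clips are discarded, which is one possible design choice for keeping the synthesizer and speaker shifts simultaneous.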
