Multilingual Source Tracing of Speech Deepfakes: A First Benchmark

Xi Xuan; Yang Xiao; Rohan Kumar Das; Tomi Kinnunen

arXiv:2508.04143 · eess.AS · August 7, 2025

TL;DR

This paper presents the first benchmark for tracing the source models of multilingual speech deepfakes, analyzing cross-lingual generalization and the impact of different modeling approaches.

Contribution

It introduces a comprehensive benchmark for multilingual speech deepfake source tracing, comprising a dataset, a protocol, and an analysis of modeling techniques.

Findings

01 SSL representations improve cross-lingual generalization

02 Model identification is challenging across unseen languages

03 Fine-tuning impacts source tracing performance

Abstract

Recent progress in generative AI has made it increasingly easy to create natural-sounding deepfake speech from just a few seconds of audio. While these tools support helpful applications, they also raise serious concerns by making it possible to generate convincing fake speech in many languages. Current research has largely focused on detecting fake speech, but little attention has been given to tracing the source models used to generate it. This paper introduces the first benchmark for multilingual speech deepfake source tracing, covering both mono- and cross-lingual scenarios. We comparatively investigate DSP- and SSL-based modeling; examine how SSL representations fine-tuned on different languages impact cross-lingual generalization performance; and evaluate generalization to unseen languages and speakers. Our findings offer the first comprehensive insights into the challenges of…
