MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
Nicolas M. Müller, Piotr Kawa, Wei Herng Choong, Edresson Casanova, Eren Gölge, Thorsten Müller, Piotr Syga, Philip Sperl, Konstantin Böttinger

TL;DR
MLAAD is a multi-language synthetic audio dataset for training and evaluating audio deepfake detection models. As training data it outperforms InTheWild and FakeOrReal and complements ASVspoof 2019, and it is published openly to make anti-spoofing technology more accessible.
Contribution
Introduces MLAAD, a large-scale, multi-language synthetic audio dataset, and evaluates it as training data for three deepfake detection models, showing that it is effective and complementary to existing datasets.
Findings
As a training resource, MLAAD outperforms comparable datasets such as InTheWild and FakeOrReal.
Across eight test datasets, MLAAD and ASVspoof 2019 each outperform the other on four, making them complementary.
The dataset is published, and a trained model is accessible via an interactive webserver.
Abstract
This paper presents the Multi-Language Audio Anti-Spoofing Dataset (MLAAD), version 10: a dataset of synthetic audio to train and evaluate audio deepfake detection models. It features 175 Text-to-Speech (TTS) models, comprising a total of 1002.9 hours of synthetic voice in 54 different languages. To evaluate this dataset, we train three state-of-the-art deepfake detection models with MLAAD and observe that it demonstrates superior performance to comparable datasets like InTheWild and FakeOrReal when used as a training resource. Moreover, compared to the renowned ASVspoof 2019 dataset, MLAAD proves to be a complementary resource. In tests across eight datasets, MLAAD and ASVspoof 2019 alternately outperformed each other, each excelling on four datasets. By publishing the dataset and making a trained model accessible via an interactive webserver, we aim to democratize anti-spoofing…
Taxonomy
Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Music and Audio Processing
