LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation

Hieu-Thi Luong; Haoyang Li; Lin Zhang; Kong Aik Lee; Eng Siong Chng

arXiv:2409.14743 · eess.AS · January 7, 2025

TL;DR

This paper introduces LlamaPartialSpoof, a comprehensive fake speech dataset created with LLMs and voice cloning to evaluate and improve countermeasure systems against diverse disinformation tactics.

Contribution

It presents a new 130-hour dataset containing both fully and partially fake speech, exposes vulnerabilities in current detection systems, and emphasizes the need for more robust defenses.

Findings

01. Current detection systems show high error rates on the dataset; the best achieves a 24.49% equal error rate (EER).

02. Vulnerabilities include biases toward specific text-to-speech (TTS) models and concatenation methods.

03. Detection systems struggle to generalize to unseen fake speech scenarios.

Abstract

Previous fake speech datasets were constructed from a defender's perspective to develop countermeasure (CM) systems, without considering the diverse motivations of attackers. To better align with real-life scenarios, we created LlamaPartialSpoof, a 130-hour dataset that contains both fully and partially fake speech, using a large language model (LLM) and voice cloning technologies to evaluate the robustness of CMs. By examining information valuable to both attackers and defenders, we identify several key vulnerabilities in current CM systems that can be exploited to raise attack success rates, including biases toward certain text-to-speech models or concatenation methods. Our experimental results indicate that current fake speech detection systems struggle to generalize to unseen scenarios, with the best system achieving a 24.49% equal error rate.
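The abstract reports performance as an equal error rate (EER): the operating point at which the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of how such a figure could be computed from CM scores. It is not the authors' evaluation code; the function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    bona_fide_scores: Sequence[float],
    spoof_scores: Sequence[float],
) -> Tuple[float, float]:
    """Return (eer, threshold) for a set of countermeasure scores.

    Assumption: a higher score means "more likely bona fide".
    The EER is approximated at the candidate threshold where the
    false acceptance and false rejection rates are closest.
    """
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best_gap, best_eer, best_thr = float("inf"), 1.0, thresholds[0]
    for thr in thresholds:
        # False rejection: bona fide utterances scored below the threshold.
        frr = sum(s < thr for s in bona_fide_scores) / len(bona_fide_scores)
        # False acceptance: spoofed utterances scored at or above the threshold.
        far = sum(s >= thr for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


# Toy scores (hypothetical, not taken from the paper).
bona_fide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(bona_fide, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On the toy scores one bona fide utterance and one spoofed utterance are misclassified at the chosen threshold, so both error rates are 25%. An EER of 24.49% on real data means roughly one in four trials of each class is misclassified at the balanced operating point.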

Code & Models

Repositories

hieuthi/llamapartialspoof (official)

Datasets

HaoY0001/LlamaPartialSpoof (dataset, 43 downloads)

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques

Methods: ALIGN