Linguistic Profiling of Deepfakes: An Open Database for Next-Generation Deepfake Detection
Yabin Wang, Zhiwu Huang, Zhiheng Ma, and Xiaopeng Hong

TL;DR
This paper introduces DFLIP-3K, a comprehensive open database with 300K deepfake samples and 190K linguistic footprints, enabling advanced, explainable deepfake detection and model identification through linguistic profiling.
Contribution
The paper presents a deepfake database with linguistic annotations that covers the largest number of generative models in the literature, supporting interpretable deepfake detection methods and establishing a benchmark for linguistic profiling.
Findings
DFLIP-3K contains about 300K deepfake samples from approximately 3K generative models.
The database includes around 190K linguistic footprints of these deepfakes.
Experiments show DFLIP-3K effectively supports linguistic-based detection and identification.
Abstract
The emergence of text-to-image generative models has revolutionized the field of deepfakes, enabling the creation of realistic and convincing visual content directly from textual descriptions. However, this advancement makes detecting the authenticity of such content considerably more challenging. Existing deepfake detection datasets and methods often fall short in capturing the extensive range of emerging deepfakes and in offering satisfactory explanatory information for detection. To address this issue, this paper introduces a deepfake database (DFLIP-3K) for the development of convincing and explainable deepfake detection. It encompasses about 300K diverse deepfake samples from approximately 3K generative models, the largest number of deepfake models in the literature. Moreover, it collects around 190K linguistic footprints of these deepfakes.…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Computational and Text Analysis Methods
