Linguistic Profiling of Deepfakes: An Open Database for Next-Generation Deepfake Detection

Yabin Wang, Zhiwu Huang, Zhiheng Ma, and Xiaopeng Hong

arXiv:2401.02335·cs.CV·January 5, 2024·1 cites

TL;DR

This paper introduces DFLIP-3K, an open database of about 300K deepfake samples from approximately 3K generative models, together with around 190K linguistic footprints, intended to support explainable deepfake detection and model identification through linguistic profiling.

Contribution

The paper presents a deepfake database with linguistic annotations that covers the largest number of deepfake generative models in the literature, supporting interpretable deepfake detection methods and establishing a benchmark for linguistic profiling.

Findings

01 DFLIP-3K contains about 300K deepfake samples from approximately 3K generative models.

02 The database includes around 190K linguistic footprints of these deepfakes.

03 Experiments show that DFLIP-3K supports linguistic-profiling-based deepfake detection and model identification.

Abstract

The emergence of text-to-image generative models has revolutionized the field of deepfakes, enabling the creation of realistic and convincing visual content directly from textual descriptions. However, this advancement makes it considerably harder to verify the authenticity of such content. Existing deepfake detection datasets and methods often fall short in capturing the extensive range of emerging deepfakes and in offering satisfactory explanatory information for detection. To address this issue, this paper introduces a deepfake database (DFLIP-3K) for the development of convincing and explainable deepfake detection. It encompasses about 300K diverse deepfake samples from approximately 3K generative models, the largest number of deepfake models in the literature. Moreover, it collects around 190K linguistic footprints of these deepfakes.…

Code & Models

Repositories

dflip3k/dflip-3k (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Computational and Text Analysis Methods