FantasyID: A dataset for detecting digital manipulations of ID-documents

Pavel Korshunov; Amir Mohammadi; Vidit Vidit; Christophe Ecabert; Sébastien Marcel

arXiv:2507.20808·cs.CV·July 29, 2025

TL;DR

FantasyID is a new, publicly available dataset designed to improve the detection of forged ID documents, challenging current algorithms with realistic, diverse, and complex manipulations to advance KYC security.

Contribution

The paper introduces FantasyID, a dataset for ID forgery detection that combines ID cards mimicking real-world documents with digitally manipulated versions of them, to facilitate the development of more robust detection algorithms.

Findings

01

Current state-of-the-art algorithms struggle with FantasyID, especially at low false positive rates.

02

The dataset reveals significant challenges in forgery detection, with false negative rates near 50%.

03

FantasyID serves as a challenging benchmark for future detection algorithm improvements.
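
The findings above are stated in terms of the false negative rate at a low false positive rate. As a rough illustration of how such an operating point can be computed from detector scores, here is a minimal sketch. It is not the paper's evaluation code: the function name `fnr_at_fpr`, the convention that a higher score means "more likely forged", the threshold rule, and the toy scores are all assumptions made for this example.

```python
import numpy as np


def fnr_at_fpr(bonafide_scores, forged_scores, target_fpr):
    """Return (threshold, fnr) at the operating point set by target_fpr.

    Assumed convention (not taken from the paper): a sample is flagged
    as forged when its score is strictly greater than the threshold.
    """
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    bonafide = np.sort(np.asarray(bonafide_scores, dtype=float))
    forged = np.asarray(forged_scores, dtype=float)

    # Number of bona fide samples we may wrongly flag without
    # exceeding the target false positive rate.
    allowed = int(np.floor(target_fpr * bonafide.size + 1e-9))

    # Lowest threshold that leaves at most `allowed` bona fide
    # scores strictly above it.
    threshold = bonafide[bonafide.size - allowed - 1]

    # False negatives: forged samples that are not flagged.
    fnr = float(np.mean(forged <= threshold))
    return float(threshold), fnr


# Toy scores, invented purely for illustration.
bonafide = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.60, 0.80]
forged = [0.30, 0.50, 0.70, 0.90, 0.95]
threshold, fnr = fnr_at_fpr(bonafide, forged, target_fpr=0.1)
print(threshold, fnr)  # prints: 0.6 0.4
```

Tightening `target_fpr` pushes the threshold up, so fewer bona fide cards are flagged but more forgeries slip through, which is why low false positive rates are the harder operating region.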

Abstract

Advances in image generation have made easy-to-use tools for creating forged images available to malicious actors. These tools pose a serious threat to widely deployed Know Your Customer (KYC) applications, which require robust systems for detecting forged Identity Documents (IDs). To facilitate the development of detection algorithms, in this paper we propose FantasyID, a novel publicly available (including for commercial use) dataset that mimics real-world IDs without tampering with legal documents and, unlike previous public datasets, contains no generated faces or specimen watermarks. FantasyID contains ID cards with diverse design styles, languages, and faces of real people. To simulate a realistic KYC scenario, the cards from FantasyID were printed and captured with three different devices; these captures constitute the bonafide class. We have emulated digital…
