Leveraging Contrastive Learning for a Similarity-Guided Tampered Document Data Generation Pipeline

Mohamed Dhouib; Davide Buscaldi; Sonia Vanier; Aymen Shabou

arXiv:2602.17322 · cs.CV · February 20, 2026

TL;DR

This paper introduces a contrastive learning-based pipeline for generating diverse, high-quality tampered document images to improve detection models, addressing limitations of previous rule-based methods and enhancing robustness in real-world scenarios.

Contribution

The authors propose a generation pipeline guided by auxiliary networks, one of which is trained with contrastive learning to compare text crops, to produce realistic tampered document images and improve the diversity and quality of training data for detection models.

Findings

01. Generated datasets lead to improved detection performance.

02. The pipeline produces more realistic and varied tampered documents.

03. Models trained on the generated data outperform those trained on data from previous generation methods.

Abstract

Detecting tampered text in document images is a challenging task due to data scarcity. To address this, previous work has attempted to generate tampered documents using rule-based methods. However, the resulting documents often suffer from limited variety and poor visual quality, typically leaving highly visible artifacts that are rarely observed in real-world manipulations. This undermines the model's ability to learn robust, generalizable features and results in poor performance on real-world data. Motivated by this discrepancy, we propose a novel method for generating high-quality tampered document images. We first train an auxiliary network to compare text crops, leveraging contrastive learning with a novel strategy for defining positive pairs and their corresponding negatives. We also train a second auxiliary network to evaluate whether a crop tightly encloses the intended…
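
To make the idea of comparing text crops with contrastive learning more concrete, here is a minimal sketch of a generic InfoNCE-style loss over crop embeddings. It is not the authors' method: the abstract does not specify the loss, the encoder, or the strategy for choosing positives and negatives, so the function names (`cosine`, `info_nce`), the temperature value, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: low when the anchor is closer to its
    positive than to every negative (hypothetical helper, not from the paper)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability assigned to the positive pair.
    return log_denom - logits[0]

# Toy embeddings standing in for the output of a crop encoder (assumed).
anchor = [1.0, 0.0]
similar_crop = [0.9, 0.1]
dissimilar_crops = [[0.0, 1.0], [-1.0, 0.0]]

# Correct pairing: the visually similar crop is the positive.
loss_good = info_nce(anchor, similar_crop, dissimilar_crops)
# Wrong pairing: a dissimilar crop is treated as the positive.
loss_bad = info_nce(anchor, dissimilar_crops[0],
                    [similar_crop, dissimilar_crops[1]])
```

In this sketch the loss is small when the positive is the most similar candidate and large otherwise, which is the property a similarity network would need in order to rank candidate crops.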

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques