Toward Real-world Text Image Forgery Localization: Structured and Interpretable Data Synthesis
Zeqin Yu, Haotao Xie, Jian Zhang, Jiangqun Ni, Wenkan Su, Jiwu Huang

TL;DR
This paper introduces Fourier Series-based Tampering Synthesis (FSTS), a structured data generation framework that improves the realism and diversity of training data for text image forgery localization, enhancing model generalization to real-world cases.
Contribution
FSTS is a novel, interpretable framework that models tampering behaviors hierarchically using Fourier series principles to synthesize realistic training data for forgery localization.
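For readers unfamiliar with the "Fourier series principles" mentioned above, the sketch below shows only the classical idea: a target function is approximated by a weighted sum of fixed basis functions. It is not the FSTS method. The summary does not specify which bases or weights FSTS uses, so the function names (`fourier_partial_sum`, `square_wave_coeffs`) and the square-wave target are hypothetical choices made purely for illustration.

```python
import math


def fourier_partial_sum(t, a0, a, b):
    """Evaluate a0/2 + sum_k (a[k-1]*cos(k*t) + b[k-1]*sin(k*t))."""
    total = a0 / 2.0
    for k, (ak, bk) in enumerate(zip(a, b), start=1):
        total += ak * math.cos(k * t) + bk * math.sin(k * t)
    return total


def square_wave_coeffs(n_terms):
    """Coefficients of the square wave sign(sin t), an illustrative target only.

    Only odd sine harmonics are non-zero: b_k = 4 / (pi * k).
    """
    a = [0.0] * n_terms
    b = [4.0 / (math.pi * k) if k % 2 == 1 else 0.0
         for k in range(1, n_terms + 1)]
    return 0.0, a, b


if __name__ == "__main__":
    # More basis terms give a closer approximation to the target value 1.0.
    for n in (1, 9, 199):
        a0, a, b = square_wave_coeffs(n)
        print(n, fourier_partial_sum(math.pi / 2, a0, a, b))
```

The point of the analogy is that a small set of weights over known bases is both compact and interpretable: each weight says how much one basis contributes. How FSTS maps this idea onto tampering parameters is described in the paper itself, not in this summary.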
Findings
Models trained with FSTS data show improved generalization on real-world datasets.
FSTS effectively captures complex tampering behaviors through hierarchical modeling.
Synthetic data generated by FSTS enhances the robustness of forgery detection models.
Abstract
Existing Text Image Forgery Localization (T-IFL) methods often suffer from poor generalization due to the limited scale of real-world datasets and the distribution gap caused by synthetic data that fails to capture the complexity of real-world tampering. To tackle this issue, we propose Fourier Series-based Tampering Synthesis (FSTS), a structured and interpretable framework for synthesizing tampered text images. FSTS first collects 16,750 real-world tampering instances from five representative tampering types, using a structured pipeline that records human-performed editing traces via multi-format logs (e.g., video, PSD, and editing logs). By analyzing these collected parameters and identifying recurring behavioral patterns at both individual and population levels, we formulate a hierarchical modeling framework. Specifically, each individual tampering parameter is represented as a…
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Advanced Steganography and Watermarking Techniques
