Towards Generalized Image Manipulation Localization via Score-based Model
Yunfei Wang, Bo Du, Zhe Yang, Xin Liu, Zhiyu Lin, Tianxin Xu, and Ji-Zhe Zhou

TL;DR
This paper introduces DiffIML, a score-based generative modeling framework for image manipulation localization that improves generalization to unseen manipulation types by capturing intrinsic mask distribution geometry.
Contribution
DiffIML leverages score-based generative models with structural priors to enhance generalization in image manipulation localization, overcoming overfitting issues of discriminative methods.
Findings
DiffIML outperforms state-of-the-art methods on multiple benchmarks.
It demonstrates significant generalization to unseen datasets.
The proposed architectural improvements make the framework efficient and stable.
Abstract
With the rapid evolution of synthetic media, Image Manipulation Localization (IML) has emerged as a critical component in multimedia forensics for ensuring the integrity of digital content. However, generalization remains a core challenge, as existing discriminative methods typically learn a fixed decision boundary that tends to overfit to specific training artifacts and fails to adapt to unseen manipulation types. To address this, we propose DiffIML, a novel framework that introduces score-based generative modeling to IML. Diverging from the direct estimation of hard boundaries, DiffIML approximates the score function, the gradient of the log-likelihood, to capture the intrinsic geometric topology of mask distributions. This paradigm leverages structural priors to iteratively recover coherent masks from noise, thereby circumventing the brittleness associated with discriminative models.…
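To make the idea of iteratively recovering a mask from noise by following a score function concrete, here is a minimal toy sketch. It is not DiffIML's sampler or network: the learned, image-conditioned score model is replaced by the analytically known score of an isotropic Gaussian centered on a hypothetical 8x8 ground-truth mask, and the sampler is plain unadjusted Langevin dynamics. All names (`gaussian_score`, `langevin_sample`, `target_mask`) and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_score(x, mean, sigma):
    """Score of N(mean, sigma^2 I): the gradient of the log-density with respect to x."""
    return (mean - x) / sigma**2


def langevin_sample(score_fn, x0, step_size, n_steps, rng):
    """Unadjusted Langevin dynamics: x <- x + (eps / 2) * score(x) + sqrt(eps) * z."""
    x = x0.copy()
    for _ in range(n_steps):
        z = rng.standard_normal(x.shape)  # fresh Gaussian noise at every step
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * z
    return x


rng = np.random.default_rng(0)

# Hypothetical ground-truth mask: a 4x4 "manipulated" square inside an 8x8 image.
target_mask = np.zeros((8, 8))
target_mask[2:6, 2:6] = 1.0

# Stand-in for a learned score model (assumption): the exact score of a narrow
# Gaussian centered on the target mask.
sigma = 0.05


def score_fn(x):
    return gaussian_score(x, target_mask, sigma)


# Start from pure noise and follow the score toward the high-density region.
x0 = rng.standard_normal(target_mask.shape)
sample = langevin_sample(score_fn, x0, step_size=1e-3, n_steps=2000, rng=rng)

# Threshold the continuous sample to obtain a binary mask.
recovered = (sample > 0.5).astype(float)
```

In this toy setting the sample settles close to `target_mask`, so thresholding at 0.5 yields the square region. In a real score-based model the score would instead come from a trained network conditioned on the input image, typically evaluated across a schedule of noise levels rather than a single fixed `sigma`.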
