ManipShield: A Unified Framework for Image Manipulation Detection, Localization and Explanation
Zitong Xu, Huiyu Duan, Xiaoyu Wang, Zhaolin Cai, Kaiwei Zhang, Qiang Hu, Jing Liu, Xiongkuo Min, Guangtao Zhai

TL;DR
ManipShield is a unified framework that combines a large-scale benchmark (ManipBench) with multimodal large language models to detect, localize, and explain diverse AI-based image manipulations, with the aim of better generalization and interpretability.
Contribution
The paper introduces ManipBench, a large-scale benchmark for diverse image manipulations, and ManipShield, a unified model that achieves state-of-the-art detection, localization, and explanation performance.
Findings
ManipShield outperforms existing methods on ManipBench and public datasets.
ManipShield generalizes well to unseen manipulation models.
ManipBench annotates 100K of its more than 450K manipulated images, including with bounding boxes, to support interpretability.
Abstract
With the rapid advancement of generative models, powerful image editing methods now enable diverse and highly realistic image manipulations that far surpass traditional deepfake techniques, posing new challenges for manipulation detection. Existing image manipulation detection and localization (IMDL) benchmarks suffer from limited content diversity, narrow generative-model coverage, and insufficient interpretability, which hinders the generalization and explanation capabilities of current manipulation detection methods. To address these limitations, we introduce ManipBench, a large-scale benchmark for image manipulation detection and localization focusing on AI-edited images. ManipBench contains over 450K manipulated images produced by 25 state-of-the-art image editing models across 12 manipulation categories, among which 100K images are further annotated with bounding boxes,…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Cell Image Analysis Techniques
