Propose and Rectify: A Forensics-Driven MLLM Framework for Image Manipulation Localization

Keyang Zhang; Chenqi Kong; Hui Liu; Bo Ding; Xinghao Jiang; Haoliang Li

arXiv:2508.17976 · cs.CV · August 26, 2025

TL;DR

This paper introduces Propose-Rectify, a forensic framework that pairs MLLM-based semantic reasoning with low-level forensic analysis to detect and localize manipulated regions in images; the authors report state-of-the-art results.

Contribution

The paper presents a Propose-Rectify framework that combines a forensic-adapted LLaVA model with forensic feature analysis and enhanced segmentation to improve manipulation localization.

Findings

01. Achieves state-of-the-art detection accuracy.

02. Demonstrates robustness across diverse datasets.

03. Effectively combines semantic reasoning with forensic evidence.

Abstract

The increasing sophistication of image manipulation techniques demands robust forensic solutions that can both reliably detect alterations and precisely localize tampered regions. Recent Multimodal Large Language Models (MLLMs) show promise by leveraging world knowledge and semantic understanding for context-aware detection, yet they struggle with perceiving subtle, low-level forensic artifacts crucial for accurate manipulation localization. This paper presents a novel Propose-Rectify framework that effectively bridges semantic reasoning with forensic-specific analysis. In the proposal stage, our approach utilizes a forensic-adapted LLaVA model to generate initial manipulation analysis and preliminary localization of suspicious regions based on semantic understanding and contextual reasoning. In the rectification stage, we introduce a Forensics Rectification Module that systematically…
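The two-stage control flow described above can be illustrated with a toy sketch. The abstract is cut off before it details the Forensics Rectification Module, so nothing below should be read as the paper's method: `propose`, `rectify`, the bounding-box proposal, the per-pixel `forensic_map`, and the fixed `threshold` are all hypothetical stand-ins chosen for illustration. The sketch assumes only that a semantic stage proposes a suspicious region and that a second stage uses low-level forensic evidence to decide which pixels of that proposal to keep.

```python
from dataclasses import dataclass
from typing import List, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive


@dataclass
class Proposal:
    description: str  # free-text analysis from the semantic stage
    box: Box          # coarse region flagged as suspicious


def propose(image: List[List[float]]) -> Proposal:
    """Hypothetical stand-in for the proposal stage.

    A real system would query a forensic-adapted MLLM here; this stub
    returns a fixed region so the example runs without any model.
    """
    return Proposal("central object looks inconsistent with the scene", (1, 1, 4, 4))


def rectify(proposal: Proposal,
            forensic_map: List[List[float]],
            threshold: float = 0.5) -> Mask:
    """Hypothetical rectification: keep pixels inside the proposed box
    whose forensic score exceeds the threshold (an assumed rule)."""
    height, width = len(forensic_map), len(forensic_map[0])
    r0, c0, r1, c1 = proposal.box
    mask = [[0] * width for _ in range(height)]
    for r in range(max(r0, 0), min(r1, height)):
        for c in range(max(c0, 0), min(c1, width)):
            if forensic_map[r][c] > threshold:
                mask[r][c] = 1
    return mask


# Toy 5x5 forensic score map (made-up values, higher = more suspicious).
forensic_map = [
    [0.9, 0.1, 0.1, 0.1, 0.1],  # strong response outside the proposed box
    [0.1, 0.2, 0.8, 0.1, 0.1],
    [0.1, 0.7, 0.9, 0.2, 0.1],
    [0.1, 0.1, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]

proposal = propose(image=forensic_map)  # the stub ignores its input
mask = rectify(proposal, forensic_map)
for row in mask:
    print(row)
```

In this toy version the proposal limits where the forensic evidence is considered, and the forensic scores trim the coarse box down to a pixel-level mask. Whether the actual framework couples the two stages this way is not stated in the visible portion of the abstract.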
