Unlocking the Forgery Detection Potential of Vanilla MLLMs: A Novel Training-Free Pipeline
Rui Zuo, Qinyue Tong, Zhe-Ming Lu, Ziqian Lu

TL;DR
This paper introduces Foresee, a training-free pipeline that applies vanilla multimodal large language models (MLLMs) to image forgery detection and localization. It reports more accurate tamper localization and richer textual explanations than existing methods, with no additional training.
Contribution
A training-free approach that adapts vanilla MLLMs to image forgery detection and is reported to surpass existing methods in tamper localization accuracy, interpretability, and generalization.
Findings
Outperforms existing methods in tamper localization accuracy
Provides richer textual explanations for forgeries than existing methods
Demonstrates strong generalization across diverse tampering types
Abstract
With the rapid advancement of artificial intelligence-generated content (AIGC) technologies, including multimodal large language models (MLLMs) and diffusion models, image generation and manipulation have become remarkably effortless. Existing image forgery detection and localization (IFDL) methods often struggle to generalize across diverse datasets and offer limited interpretability. MLLMs now demonstrate strong generalization potential across diverse vision-language tasks, and some studies bring this capability to IFDL via large-scale training. However, such approaches consume considerable computational resources and fail to reveal the inherent generalization potential of vanilla MLLMs for this problem. Inspired by this observation, we propose Foresee, a training-free MLLM-based pipeline tailored for image forgery analysis. It eliminates the need for additional…
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Authorship Attribution and Profiling
