Unlocking the Forgery Detection Potential of Vanilla MLLMs: A Novel Training-Free Pipeline

Rui Zuo; Qinyue Tong; Zhe-Ming Lu; Ziqian Lu

arXiv:2511.13442·cs.CV·November 19, 2025

Open Access

TL;DR

This paper introduces Foresee, a training-free pipeline leveraging vanilla multimodal large language models for image forgery detection, achieving superior localization and explanation capabilities without additional training.

Contribution

It presents a novel training-free approach that enhances vanilla MLLMs for image forgery detection, surpassing existing methods in accuracy, interpretability, and generalization.

Findings

01 Outperforms existing methods in tamper localization accuracy

02 Provides richer textual explanations for forgeries

03 Demonstrates strong generalization across diverse tampering types

Abstract

With the rapid advancement of artificial intelligence-generated content (AIGC) technologies, including multimodal large language models (MLLMs) and diffusion models, image generation and manipulation have become remarkably effortless. Existing image forgery detection and localization (IFDL) methods often struggle to generalize across diverse datasets and offer limited interpretability. Nowadays, MLLMs demonstrate strong generalization potential across diverse vision-language tasks, and some studies introduce this capability to IFDL via large-scale training. However, such approaches cost considerable computational resources, while failing to reveal the inherent generalization potential of vanilla MLLMs to address this problem. Inspired by this observation, we propose Foresee, a training-free MLLM-based pipeline tailored for image forgery analysis. It eliminates the need for additional…

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Authorship Attribution and Profiling