Weakly-Supervised Image Forgery Localization via Vision-Language Collaborative Reasoning Framework
Ziqi Sheng, Junyan Wu, Wei Lu, Jiantao Zhou

TL;DR
ViLaCo introduces a vision-language reasoning framework that leverages pre-trained models and semantic supervision to improve weakly-supervised image forgery localization, achieving state-of-the-art results with only image-level labels.
Contribution
The paper presents ViLaCo, a framework that combines pre-trained vision-language models with semantic reasoning to localize forgeries at the pixel level from image-level labels alone, surpassing existing weakly supervised image forgery localization (WSIFL) methods.
Findings
Outperforms existing WSIFL methods in localization accuracy
Achieves state-of-the-art detection performance on public datasets
Effectively utilizes semantic knowledge from pre-trained VLMs
Abstract
Image forgery localization aims to precisely identify tampered regions within images, but it commonly depends on costly pixel-level annotations. To alleviate this annotation burden, weakly supervised image forgery localization (WSIFL) has emerged, yet existing methods still achieve limited localization performance as they mainly exploit intra-image consistency clues and lack external semantic guidance to compensate for weak supervision. In this paper, we propose ViLaCo, a vision-language collaborative reasoning framework that introduces auxiliary semantic supervision distilled from pre-trained vision-language models (VLMs), enabling accurate pixel-level localization using only image-level labels. Specifically, ViLaCo first incorporates semantic knowledge through a vision-language feature modeling network, which jointly extracts textual and visual priors using pre-trained VLMs. Next, an…
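To make the weak-supervision idea in the abstract concrete, here is a minimal NumPy sketch of one generic way to turn vision-language features into a pixel-level map trained with only an image-level label. It is not ViLaCo's architecture, since the abstract is truncated before the remaining modules are described. Everything in it is an illustrative assumption: the function names (`patch_forgery_map`, `image_level_score`, `bce`), the two-prompt setup (one "authentic" and one "forged" text embedding), the temperature, and the top-k pooling rule.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def patch_forgery_map(patch_feats, text_feats, grid_hw, temperature=0.07):
    """Hypothetical helper: per-patch forgery probability from patch/text similarity.

    patch_feats: (N, D) visual patch embeddings, N = h * w.
    text_feats:  (2, D) text embeddings, row 0 = "authentic", row 1 = "forged".
    Returns an (h, w) map with values in [0, 1].
    """
    h, w = grid_hw
    p = l2_normalize(patch_feats)
    t = l2_normalize(text_feats)
    logits = p @ t.T / temperature                 # (N, 2) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[:, 1].reshape(h, w)               # probability of the "forged" prompt


def image_level_score(heatmap, top_k=4):
    """Assumed pooling rule: mean of the top-k patch scores gives the image score."""
    flat = np.sort(heatmap.ravel())[::-1]
    return float(flat[:top_k].mean())


def bce(score, label, eps=1e-7):
    """Binary cross-entropy against the image-level label (the only supervision)."""
    s = np.clip(score, eps, 1.0 - eps)
    return float(-(label * np.log(s) + (1 - label) * np.log(1.0 - s)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.normal(size=(16, 8))   # stand-in for VLM patch features
    prompts = rng.normal(size=(2, 8))    # stand-in for VLM text features
    heatmap = patch_forgery_map(patches, prompts, grid_hw=(4, 4))
    score = image_level_score(heatmap)
    print(heatmap.shape, score, bce(score, label=1))
```

In this sketch the loss only ever sees the pooled image-level score, yet its gradient would flow back to the individual patch scores, which is how a localization map can emerge without pixel-level masks.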
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques
