SAPL: Semantic-Agnostic Prompt Learning in CLIP for Weakly Supervised Image Manipulation Localization

Xinghao Wang; Changtao Miao; Dianmo Sheng; Tao Gong; Qi Chu; Nenghai Yu; Quanchen Zou; Deyue Zhang; Xiangzheng Zhang

arXiv:2601.06222·cs.CV·January 13, 2026

TL;DR

SAPL introduces a novel weakly supervised method for image manipulation localization that emphasizes boundary cues over semantics, leveraging CLIP with edge-focused prompt learning and contrastive techniques.

Contribution

The paper proposes SAPL, a boundary-centric prompt learning framework in CLIP that improves manipulation localization without pixel-level annotations.

Findings

01. SAPL outperforms existing methods on multiple benchmarks.

02. Edge-aware modules significantly enhance localization accuracy.

03. Boundary-focused cues improve manipulation detection in weakly supervised settings.

Abstract

Malicious image manipulation threatens public safety and requires efficient localization methods. Existing approaches depend on costly pixel-level annotations, which make training expensive. Existing weakly supervised methods rely only on image-level binary labels and focus on global classification, often overlooking local edge cues that are critical for precise localization. We observe that feature variations at manipulated boundaries are substantially larger than in interior regions. To address this gap, we propose Semantic-Agnostic Prompt Learning (SAPL) in CLIP, which learns text prompts that intentionally encode non-semantic, boundary-centric cues so that CLIP's multimodal similarity highlights manipulation edges rather than high-level object semantics. SAPL combines two complementary modules, Edge-aware Contextual Prompt Learning (ECPL) and Hierarchical Edge Contrastive Learning…
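The boundary observation in the abstract can be illustrated with a toy example. The sketch below is not SAPL's method and uses no CLIP features: it assumes a hypothetical 6x6 grid of 2-D patch features, with a pasted 4x4 region carrying a different feature vector, and measures how much each patch differs from its 4-neighbours. The names `local_variation`, `AUTHENTIC`, and `TAMPERED` are invented for illustration.

```python
import math


def local_variation(grid):
    """Mean L2 distance from each cell's feature vector to its 4-neighbours."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dists = []
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    dists.append(math.dist(grid[y][x], grid[ny][nx]))
            out[y][x] = sum(dists) / len(dists)
    return out


# Toy feature map (an assumption, not real CLIP output): authentic patches
# share one feature vector, a pasted 4x4 region (rows/cols 1..4) another.
AUTHENTIC, TAMPERED = (0.0, 1.0), (1.0, 0.0)
grid = [
    [TAMPERED if 1 <= y <= 4 and 1 <= x <= 4 else AUTHENTIC for x in range(6)]
    for y in range(6)
]

variation = local_variation(grid)
boundary = variation[1][1]  # corner patch of the tampered region
interior = variation[2][2]  # patch fully surrounded by tampered patches

print(f"boundary variation: {boundary:.3f}")
print(f"interior variation: {interior:.3f}")
```

In this idealized setting the variation is nonzero only along the edge of the pasted region, which is the kind of signal a boundary-centric method would try to expose. Real features are noisy, so the gap would be far less clean.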


Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis