CIEC: Coupling Implicit and Explicit Cues for Multimodal Weakly Supervised Manipulation Localization
Xinquan Yu, Wei Lu, Xiangyang Luo, Rui Yang

TL;DR
This paper introduces CIEC, a weakly-supervised framework for multimodal manipulation localization in images and text, using only coarse annotations and combining visual and textual cues to accurately identify manipulated regions.
Contribution
The novel CIEC framework couples implicit and explicit cues for multimodal weakly-supervised manipulation localization, reducing reliance on detailed annotations and improving localization accuracy.
Findings
Achieves results comparable to fully supervised methods.
Effective in suppressing irrelevant background interference.
Utilizes novel modules TRPS and VCTG for cue integration and noise mitigation.
Abstract
To mitigate the threat of misinformation, multimodal manipulation localization has garnered growing attention. Because current methods rely on costly and time-consuming fine-grained annotations, such as patch/token-level annotations, this paper proposes a novel framework named Coupling Implicit and Explicit Cues (CIEC), which aims to achieve multimodal weakly-supervised manipulation localization for image-text pairs using only coarse-grained image/sentence-level annotations. It comprises two branches: image-based and text-based weakly-supervised localization. For the former, we devise the Textual-guidance Refine Patch Selection (TRPS) module, which integrates forgery cues from both visual and textual perspectives to lock onto suspicious regions, aided by spatial priors. This is followed by background silencing and spatial contrast constraints to suppress interference from irrelevant…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Misinformation and Its Impacts
