Detecting and Grounding Multi-Modal Media Manipulation and Beyond
Rui Shao, Tianxing Wu, Jianlong Wu, Liqiang Nie, Ziwei Liu

TL;DR
This paper introduces a new problem of detecting and grounding multi-modal media manipulation, proposing a novel dataset and a hierarchical transformer model to analyze subtle cross-modal forgery traces.
Contribution
It presents the first large-scale dataset for multi-modal fake media detection and grounding, along with a novel hierarchical transformer model, HAMMER, for fine-grained manipulation reasoning across modalities.
Findings
HAMMER outperforms existing methods in manipulation detection and grounding.
HAMMER++, an extended version of HAMMER, achieves further improvements with contrastive learning.
The dataset enables comprehensive evaluation of multi-modal manipulation detection.
Abstract
Misinformation has become a pressing issue. Fake media, in both visual and textual forms, is widespread on the web. While various deepfake detection and text fake news detection methods have been proposed, they are only designed for single-modality forgery based on binary classification, let alone analyzing and reasoning subtle forgery traces across different modalities. In this paper, we highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM^4). DGM^4 aims to not only detect the authenticity of multi-modal media, but also ground the manipulated content, which requires deeper reasoning of multi-modal media manipulation. To support a large-scale investigation, we construct the first DGM^4 dataset, where image-text pairs are manipulated by various approaches, with rich annotation of diverse manipulations. Moreover,…
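To make the task statement above concrete, here is a minimal sketch of what a "detect and ground" prediction for one image-text pair might look like. It is an illustration only, not the authors' data format or the HAMMER interface: the class name GroundedPrediction, its fields, the normalized box convention, and the whitespace tokenization are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GroundedPrediction:
    """Hypothetical output record for one image-text pair (not the paper's format)."""
    # Detection: is the pair judged to be manipulated at all?
    is_manipulated: bool
    # Image grounding: assumed normalized (x1, y1, x2, y2) box, or None if the image looks untouched.
    image_box: Optional[Tuple[float, float, float, float]] = None
    # Text grounding: indices of caption tokens judged to be manipulated.
    text_token_indices: List[int] = field(default_factory=list)


def manipulated_tokens(caption: str, pred: GroundedPrediction) -> List[str]:
    """Map grounded token indices back to words, assuming whitespace tokenization."""
    tokens = caption.split()
    # Ignore indices that fall outside the caption rather than raising.
    return [tokens[i] for i in pred.text_token_indices if 0 <= i < len(tokens)]


# Usage with made-up values: a pair flagged as manipulated in both modalities.
example = GroundedPrediction(
    is_manipulated=True,
    image_box=(0.10, 0.20, 0.50, 0.60),
    text_token_indices=[1, 3],
)
flagged = manipulated_tokens("the mayor angrily rejected the proposal", example)
```

The point of the sketch is the contrast with single-modality binary classifiers: a bare real/fake label would correspond to is_manipulated alone, while grounding adds the image region and the text tokens.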
Taxonomy
Topics: Misinformation and Its Impacts · Advanced Malware Detection Techniques · Spam and Phishing Detection
Methods: Contrastive Learning
