Multimodal Fake News Detection via CLIP-Guided Learning
Yangming Zhou, Qichao Ying, Zhenxing Qian, Sheng Li, Xinpeng Zhang

TL;DR
This paper introduces FND-CLIP, a multimodal fake news detection framework leveraging CLIP for improved feature extraction and fusion, demonstrating superior accuracy on multiple datasets through a novel similarity-guided approach.
Contribution
The paper proposes a new multimodal fake news detection method using CLIP-guided feature extraction and an attention mechanism, enhancing detection accuracy and feature selection flexibility.
Findings
Achieves up to a 6.8% accuracy improvement on the Politifact dataset.
Utilizes CLIP for better multimodal feature representation.
Introduces a modality-wise attention module for adaptive feature fusion.
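As a rough illustration of what "modality-wise attention for adaptive feature fusion" can mean, the sketch below scores each modality's feature vector, normalizes the scores with a softmax, and returns the weighted sum. This is not the paper's module: the function names, the linear scoring rule, and the `score_weights` parameters are hypothetical, and a real model would learn these parameters rather than take them as input.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention(features, score_weights):
    """Fuse per-modality vectors with one attention weight per modality.

    features:      dict mapping modality name -> feature vector (equal lengths)
    score_weights: dict mapping modality name -> scoring vector
                   (hypothetical stand-in for learned parameters)
    """
    names = list(features)
    # One scalar relevance score per modality (assumed linear scoring).
    scores = [
        sum(w * x for w, x in zip(score_weights[n], features[n]))
        for n in names
    ]
    attn = softmax(scores)
    dim = len(features[names[0]])
    # Weighted sum of the modality vectors, dimension by dimension.
    fused = [
        sum(a * features[n][i] for a, n in zip(attn, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, attn))


features = {"text": [1.0, 0.0], "image": [0.0, 1.0], "clip": [1.0, 1.0]}
score_weights = {name: [0.0, 0.0] for name in features}
fused, attn = modality_attention(features, score_weights)
```

With all scoring vectors set to zero, every modality receives the same weight, so the fused vector is simply the mean of the three inputs. Non-zero scoring vectors would shift weight toward whichever modality scores highest for a given news item.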
Abstract
Multimodal fake news detection (FND) has attracted considerable research interest in social forensics. Many existing approaches introduce tailored attention mechanisms to guide the fusion of unimodal features. However, how the similarity of these features is calculated and how it affects the decision-making process in FND are still open questions. Moreover, the potential of pretrained multimodal feature learning models in fake news detection has not been well exploited. This paper proposes the FND-CLIP framework, i.e., a multimodal Fake News Detection network based on Contrastive Language-Image Pretraining (CLIP). Given a targeted multimodal news item, we extract deep representations from the image and text using a ResNet-based encoder, a BERT-based encoder and two pair-wise CLIP encoders. The multimodal feature is a concatenation of the CLIP-generated features weighted by the standardized…
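The abstract is cut off before it names the quantity used for weighting, so the following is only a sketch under stated assumptions: it takes the weight to be the cosine similarity between the CLIP text and image embeddings, standardized across a batch and squashed with a sigmoid. All function names are hypothetical, and the toy vectors stand in for real CLIP outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def standardize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance across the batch."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def similarity_weighted_fusion(text_feats, image_feats):
    """Concatenate text/image embeddings and scale by a similarity weight.

    Assumption (not confirmed by the truncated abstract): the weight is
    sigmoid(standardized cosine similarity) computed per news item.
    """
    sims = [cosine_similarity(t, v) for t, v in zip(text_feats, image_feats)]
    weights = [sigmoid(s) for s in standardize(sims)]
    fused = [
        [w * x for x in list(t) + list(v)]
        for w, t, v in zip(weights, text_feats, image_feats)
    ]
    return fused, weights


# Toy batch: item 0 has matching text and image, item 1 has orthogonal ones.
text_feats = [[1.0, 0.0], [1.0, 0.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = similarity_weighted_fusion(text_feats, image_feats)
```

In this toy batch the item whose text and image agree receives the larger weight, so its concatenated CLIP feature contributes more strongly to whatever classifier consumes it, while the mismatched item's feature is damped.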
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Data-Driven Disease Surveillance
Methods: Contrastive Language-Image Pre-training
