SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection

Peng Qi; Zehong Yan; Wynne Hsu; Mong Li Lee

arXiv:2403.03170 · cs.MM · March 12, 2024 · 2 cites

TL;DR

SNIFFER is a multimodal large language model designed to detect and explain out-of-context misinformation by assessing image-text consistency and leveraging external knowledge, significantly improving accuracy and interpretability.

Contribution

The paper introduces SNIFFER, a novel two-stage instruction-tuned multimodal model that enhances out-of-context misinformation detection and explanation capabilities.

Findings

01 SNIFFER surpasses the original MLLM by over 40% in detection accuracy.

02 It outperforms state-of-the-art methods in misinformation detection.

03 It provides accurate and persuasive explanations, as validated by evaluations.

Abstract

Misinformation is a prevalent societal issue due to its potential high risks. Out-of-context (OOC) misinformation, where authentic images are repurposed with false text, is one of the easiest and most effective ways to mislead audiences. Current methods focus on assessing image-text consistency but lack convincing explanations for their judgments, which is essential for debunking misinformation. While Multimodal Large Language Models (MLLMs) have rich knowledge and innate capability for visual reasoning and explanation generation, they still lack sophistication in understanding and discovering the subtle crossmodal differences. In this paper, we introduce SNIFFER, a novel multimodal large language model specifically engineered for OOC misinformation detection and explanation. SNIFFER employs two-stage instruction tuning on InstructBLIP. The first stage refines the model's concept…

Code & Models

Models

Hugging Face: MischaQI/SNIFFER

Taxonomy

Topics: Misinformation and Its Impacts · Sentiment Analysis and Opinion Mining · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam