Multi-Modal Semantic Inconsistency Detection in Social Media News Posts

Scott McCrae; Kehan Wang; Avideh Zakhor

arXiv:2105.12855 · cs.CV · May 28, 2021 · 1 cites

Open Access

TL;DR

This paper presents a multi-modal framework for detecting semantic mismatches between videos and their captions in social media news posts, combining analysis of the caption text, the audio transcript, and the video to improve accuracy over uni-modal methods.

Contribution

The paper introduces a novel multi-modal fusion architecture for detecting semantic inconsistencies between videos and their captions in social media news posts, along with a new video-based dataset of 4,000 real-world Facebook news posts.

Findings

01 Achieves 60.5% classification accuracy on random mismatches between caption and video appearance

02 Fusion across multiple modalities improves performance

03 A new video-based dataset of 4,000 real-world Facebook news posts was curated

Abstract

As computer-generated content and deepfakes make steady improvements, semantic approaches to multimedia forensics will become more important. In this paper, we introduce a novel classification architecture for identifying semantic inconsistencies between video appearance and text caption in social media news posts. We develop a multi-modal fusion framework to identify mismatches between videos and captions in social media posts by leveraging an ensemble method based on textual analysis of the caption, automatic audio transcription, semantic video analysis, object detection, named entity consistency, and facial verification. To train and test our approach, we curate a new video-based dataset of 4,000 real-world Facebook news posts for analysis. Our multi-modal approach achieves 60.5% classification accuracy on random mismatches between caption and appearance, compared to accuracy below…
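
The "random mismatches" setup and the idea of combining per-modality signals can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (make_random_mismatches, fuse_scores, classify), the 0.5 mismatch rate, the per-modality scores, and the weighted-average fusion rule are all hypothetical. The abstract says only that an ensemble method is used and does not specify how the components are combined.

```python
import random


def make_random_mismatches(posts, mismatch_rate=0.5, seed=0):
    """Build (caption, video_id, label) samples from (caption, video_id) posts.

    label 1 means the caption was swapped in from a different post
    (a random mismatch); label 0 means the original pairing was kept.
    Hypothetical helper: the sampling scheme here is an assumption.
    """
    rng = random.Random(seed)
    samples = []
    for i, (caption, video_id) in enumerate(posts):
        if len(posts) > 1 and rng.random() < mismatch_rate:
            # Pick the caption of any other post.
            j = rng.choice([k for k in range(len(posts)) if k != i])
            samples.append((posts[j][0], video_id, 1))
        else:
            samples.append((caption, video_id, 0))
    return samples


def fuse_scores(modality_scores, weights=None):
    """Weighted average of per-modality inconsistency scores in [0, 1].

    A simple stand-in for a fusion step, not the paper's architecture.
    """
    if weights is None:
        weights = {name: 1.0 for name in modality_scores}
    total = sum(weights[name] for name in modality_scores)
    return sum(weights[name] * score
               for name, score in modality_scores.items()) / total


def classify(modality_scores, threshold=0.5, weights=None):
    """Return True when the fused score flags the post as inconsistent."""
    return fuse_scores(modality_scores, weights) >= threshold


if __name__ == "__main__":
    posts = [("caption A", "v1"), ("caption B", "v2"), ("caption C", "v3")]
    for sample in make_random_mismatches(posts):
        print(sample)
    # Made-up scores for three of the signal types named in the abstract.
    scores = {"caption_text": 0.7, "audio_transcript": 0.4, "faces": 0.9}
    print(classify(scores))
```

In a trained system the fusion weights (or a learned classifier over per-modality features) would be fit on the labeled samples; the fixed average here only shows where that step would sit.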

Taxonomy

Topics: Digital Media Forensic Detection · Multimodal Machine Learning Applications · Misinformation and Its Impacts