Interpretable Multimodal Out-of-context Detection with Soft Logic Regularization

Huanhuan Ma; Jinghao Zhang; Qiang Liu; Shu Wu; Liang Wang

arXiv:2406.04756 · cs.CV · June 10, 2024

TL;DR

This paper introduces LOGRAN, a logic regularization method for detecting out-of-context misinformation in image-caption pairs. It provides interpretable phrase-level explanations and achieves competitive results on the NewsCLIPpings dataset.

Contribution

The paper presents a logic regularization approach that decomposes out-of-context detection to the phrase level, so that the prediction for an image-caption pair can be explained by the phrase-level predictions behind it.

Findings

01 Competitive performance on NewsCLIPpings dataset

02 Faithful phrase-level out-of-context predictions

03 Enhanced interpretability with logical explanations

Abstract

The rapid spread of information through mobile devices and media has led to the widespread dissemination of false or deceptive news, causing significant concern in society. Among different types of misinformation, image repurposing, also known as out-of-context misinformation, remains highly prevalent and effective. However, current approaches for detecting out-of-context misinformation often lack interpretability and offer limited explanations. In this study, we propose a logic regularization approach for out-of-context detection called LOGRAN (LOGic Regularization for out-of-context ANalysis). The primary objective of LOGRAN is to decompose out-of-context detection to the phrase level. By employing latent variables for phrase-level predictions, the final prediction of the image-caption pair can be aggregated using logical rules. The latent variables also provide an explanation for how the…
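To make the aggregation idea concrete, here is a minimal sketch of how phrase-level probabilities could be combined into a pair-level prediction with soft logic. It is an illustration only, not the authors' implementation: it assumes the rule "the pair is out-of-context if at least one phrase is out-of-context" and relaxes that disjunction with the product t-conorm. The function names `soft_or` and `aggregate_pair`, the rule itself, and the 0.5 threshold are all hypothetical choices that the abstract does not specify.

```python
from typing import List, Sequence, Tuple


def soft_or(probs: Sequence[float]) -> float:
    """Soft disjunction via the product t-conorm: 1 - prod(1 - p_i).

    Assumed relaxation; the paper's exact soft-logic operator may differ.
    """
    none_true = 1.0
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        none_true *= 1.0 - p  # probability that no phrase so far is out-of-context
    return 1.0 - none_true


def aggregate_pair(
    phrase_ooc_probs: Sequence[float], threshold: float = 0.5
) -> Tuple[float, List[int]]:
    """Combine phrase-level out-of-context probabilities into a pair-level score.

    Returns the pair-level probability and the indices of the phrases whose
    own probability reaches the (hypothetical) threshold; those indices act
    as the phrase-level explanation for the pair-level decision.
    """
    pair_prob = soft_or(phrase_ooc_probs)
    flagged = [i for i, p in enumerate(phrase_ooc_probs) if p >= threshold]
    return pair_prob, flagged


if __name__ == "__main__":
    # Three caption phrases; only the second one conflicts with the image.
    pair_prob, flagged = aggregate_pair([0.1, 0.9, 0.2])
    print(f"pair-level probability: {pair_prob:.3f}, flagged phrases: {flagged}")
```

Because the pair-level score is a differentiable function of the phrase-level probabilities, a rule of this kind can in principle serve as a training-time regularizer while also exposing which phrases drove the final decision.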
