MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing

Siddhant Agarwal; Shivam Sharma; Preslav Nakov; Tanmoy Chakraborty

arXiv:2405.11215 · cs.CL · May 21, 2024

TL;DR

MemeMQA is a multimodal question-answering task for memes that pairs each answer with a coherent explanation. The paper contributes a new dataset, MemeMQACorpus, and a two-stage framework, ARSENAL, and reports gains over baselines in answer accuracy and semantic alignment.

Contribution

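This paper introduces the MemeMQA task, curates MemeMQACorpus (1,880 questions about 1,122 memes, each with an answer-explanation pair), and proposes ARSENAL, a two-stage multimodal framework that uses the reasoning capabilities of LLMs to answer structured questions about memes and explain the answers.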

Findings

01. MemeMQA achieves ~18% higher answer accuracy than baselines.

02. It provides coherent explanations alongside answers.

03. The framework demonstrates robustness across diverse question sets.

Abstract

Memes have evolved as a prevalent medium for diverse communication, ranging from humour to propaganda. With the rising popularity of image-focused content, there is a growing need to explore its potential harm from different aspects. Previous studies have analyzed memes in closed settings - detecting harm, applying semantic labels, and offering natural language explanations. To extend this research, we introduce MemeMQA, a multimodal question-answering framework aiming to solicit accurate responses to structured questions while providing coherent explanations. We curate MemeMQACorpus, a new dataset featuring 1,880 questions related to 1,122 memes with corresponding answer-explanation pairs. We further propose ARSENAL, a novel two-stage multimodal framework that leverages the reasoning capabilities of LLMs to address MemeMQA. We benchmark MemeMQA using competitive baselines and…

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts