Think Twice Before You Judge: Mixture of Dual Reasoning Experts for Multimodal Sarcasm Detection

Soumyadeep Jana; Abhrajyoti Kundu; Sanasam Ranbir Singh

arXiv:2507.04458 · cs.CL · October 30, 2025

TL;DR

This paper introduces MiDRE, a novel multimodal sarcasm detection model that combines internal and external reasoning experts with adaptive gating, leveraging structured rationales to improve understanding of sarcasm in image-text social media posts.

Contribution

MiDRE is the first model to integrate internal and external reasoning experts with adaptive gating for multimodal sarcasm detection, utilizing structured rationales from large vision-language models.

Findings

1. MiDRE outperforms baseline models on benchmark datasets.

2. External rationales significantly improve sarcasm detection accuracy.

3. The adaptive gating mechanism effectively balances internal and external reasoning.

Abstract

Multimodal sarcasm detection has attracted growing interest due to the rise of multimedia posts on social media. Understanding sarcastic image-text posts often requires external contextual knowledge, such as cultural references or commonsense reasoning. However, existing models struggle to capture the deeper rationale behind sarcasm, relying mainly on shallow cues like image captions or object-attribute pairs from images. To address this, we propose MiDRE (Mixture of Dual Reasoning Experts), which integrates an internal reasoning expert for detecting incongruities within the image-text pair and an external reasoning expert that utilizes structured rationales generated via Chain-of-Thought prompting to a Large Vision-Language Model. An adaptive gating mechanism dynamically weighs the two experts, selecting the most relevant reasoning path.…
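
The abstract describes an adaptive gate that weighs two experts but does not give its exact form here. The following is a minimal sketch of one common way such a gate can be built, not the authors' implementation: a softmax over two linear scores computed from the concatenated expert outputs, followed by a weighted sum. All names (`h_internal`, `h_external`, `w_gate`, `b_gate`, `gate_and_fuse`) are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_and_fuse(
    h_internal: Sequence[float],
    h_external: Sequence[float],
    w_gate: Sequence[Sequence[float]],
    b_gate: Sequence[float],
) -> Tuple[List[float], Tuple[float, float]]:
    """Fuse two expert representations with a softmax gate.

    Assumption (not from the paper): the gate is a single linear layer
    applied to the concatenation of both expert outputs, producing one
    score per expert. `w_gate` holds two rows of length 2 * d and
    `b_gate` holds two biases, where d is the expert output size.
    """
    if len(h_internal) != len(h_external):
        raise ValueError("expert outputs must have the same dimension")

    joint = list(h_internal) + list(h_external)

    # One scalar score per expert.
    scores = [
        sum(w * x for w, x in zip(row, joint)) + b
        for row, b in zip(w_gate, b_gate)
    ]
    g_int, g_ext = softmax(scores)

    # Convex combination of the two expert outputs.
    fused = [g_int * a + g_ext * b for a, b in zip(h_internal, h_external)]
    return fused, (g_int, g_ext)


if __name__ == "__main__":
    # With zero weights and biases the gate is uniform, so the fused
    # vector is the plain average of the two expert outputs.
    zeros = [[0.0] * 4, [0.0] * 4]
    fused, gates = gate_and_fuse([1.0, 0.0], [0.0, 1.0], zeros, [0.0, 0.0])
    print(fused, gates)
```

In this sketch the gate weights always sum to one, so a post whose sarcasm is visible from the image-text mismatch alone could push weight toward the internal expert, while a post that depends on a cultural reference could push it toward the external one. A trained model would learn `w_gate` and `b_gate`; the zero values above only demonstrate the mechanics.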

Taxonomy

Topics: AI in Service Interactions · Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling