Dual Modality-Aware Gated Prompt Tuning for Few-Shot Multimodal Sarcasm Detection

Soumyadeep Jana; Abhrajyoti Kundu; Sanasam Ranbir Singh

arXiv:2507.04468 · cs.CL · July 8, 2025

TL;DR

This paper introduces DMDP, a novel few-shot multimodal sarcasm detection framework that employs gated, modality-specific deep prompts and cross-modal alignment to improve detection accuracy with limited labeled data.

Contribution

DMDP is presented as the first few-shot multimodal sarcasm detection framework to combine hierarchical, modality-specific deep prompts with cross-modal alignment, with the aim of improving both per-modality feature learning and the interaction between text and image features.

Findings

01. DMDP outperforms baseline methods in few-shot settings.

02. The model generalizes well across different datasets.

03. Hierarchical prompts improve sarcasm detection accuracy.

Abstract

The widespread use of multimodal content on social media has heightened the need for effective sarcasm detection to improve opinion mining. However, existing models rely heavily on large annotated datasets, making them less suitable for real-world scenarios where labeled data is scarce. This motivates the need to explore the problem in a few-shot setting. To this end, we introduce DMDP (Deep Modality-Disentangled Prompt Tuning), a novel framework for few-shot multimodal sarcasm detection. Unlike prior methods that use shallow, unified prompts across modalities, DMDP employs gated, modality-specific deep prompts for text and visual encoders. These prompts are injected across multiple layers to enable hierarchical feature learning and better capture diverse sarcasm types. To enhance intra-modal learning, we incorporate a prompt-sharing mechanism across layers, allowing the model to…
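
The idea of gated deep prompts injected at several encoder layers can be illustrated with a minimal sketch. This is not the authors' implementation. The toy attention block, the scalar sigmoid gate, the blending rule (a layer's own prompt mixed with the prompt output carried over from the previous layer), and every name below are assumptions made for illustration only. In a dual-modality setup, one would presumably run such a routine once per encoder with separate prompt sets for text and image; cross-modal alignment is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_block(seq):
    """Stand-in for one frozen encoder layer (hypothetical): weight-free
    softmax self-attention over the sequence plus a residual connection."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        mixed = [sum(w * k[j] for w, k in zip(weights, seq)) / total
                 for j in range(d)]
        out.append([qi + math.tanh(mi) for qi, mi in zip(q, mixed)])
    return out


def gated_deep_prompt_forward(tokens, n_layers, prompts, gate_logits):
    """Prepend prompt vectors at every layer.

    Assumed gating rule: from the second layer on, the injected prompt is
    g * (this layer's own prompt) + (1 - g) * (prompt output of the previous
    layer), with g = sigmoid(gate_logits[layer]).
    """
    n_prompts = len(prompts[0])
    carried = prompts[0]
    x = tokens
    for layer in range(n_layers):
        if layer == 0:
            layer_prompt = prompts[0]
        else:
            g = sigmoid(gate_logits[layer])
            layer_prompt = [
                [g * p + (1.0 - g) * c for p, c in zip(p_vec, c_vec)]
                for p_vec, c_vec in zip(prompts[layer], carried)
            ]
        out = toy_block(layer_prompt + x)  # prompts first, then tokens
        carried, x = out[:n_prompts], out[n_prompts:]
    return x, carried


rng = random.Random(0)
d, n_layers, n_prompts, n_tokens = 4, 3, 2, 5
tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]
prompts = [[[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_prompts)]
           for _ in range(n_layers)]
gate_logits = [0.0] * n_layers  # sigmoid(0) = 0.5: equal blend at the start

text_features, prompt_state = gated_deep_prompt_forward(
    tokens, n_layers, prompts, gate_logits)
```

In this sketch only `prompts` and `gate_logits` would be trained while the block stays frozen, which is what keeps the number of tunable parameters small in a few-shot setting.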
