Dual Modality-Aware Gated Prompt Tuning for Few-Shot Multimodal Sarcasm Detection
Soumyadeep Jana, Abhrajyoti Kundu, Sanasam Ranbir Singh

TL;DR
This paper introduces DMDP, a novel few-shot multimodal sarcasm detection framework that employs gated, modality-specific deep prompts and cross-modal alignment to improve detection accuracy with limited labeled data.
Contribution
DMDP is the first to use hierarchical, modality-specific deep prompts with cross-modal alignment for few-shot multimodal sarcasm detection, enhancing feature learning and interaction.
Findings
DMDP outperforms baseline methods in few-shot settings.
The model generalizes well across different datasets.
Hierarchical prompts improve sarcasm detection accuracy.
Abstract
The widespread use of multimodal content on social media has heightened the need for effective sarcasm detection to improve opinion mining. However, existing models rely heavily on large annotated datasets, making them less suitable for real-world scenarios where labeled data is scarce. This motivates the need to explore the problem in a few-shot setting. To this end, we introduce DMDP (Deep Modality-Disentangled Prompt Tuning), a novel framework for few-shot multimodal sarcasm detection. Unlike prior methods that use shallow, unified prompts across modalities, DMDP employs gated, modality-specific deep prompts for text and visual encoders. These prompts are injected across multiple layers to enable hierarchical feature learning and better capture diverse sarcasm types. To enhance intra-modal learning, we incorporate a prompt-sharing mechanism across layers, allowing the model to…
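As a rough illustration of the idea described in the abstract (separate learnable prompts per modality, injected at several encoder layers, with a gate controlling how much of the previous layer's prompt is carried forward), here is a minimal pure-Python sketch. It is not the authors' implementation: the gating formula (a sigmoid-weighted blend), the function names `gated_prompt`, `build_deep_prompts`, and `inject`, and all sizes are assumptions chosen for illustration, and plain lists stand in for tensors.

```python
import math
import random


def sigmoid(x):
    """Logistic function; maps a gate logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_prompt(prev_prompt, layer_prompt, gate_logit):
    """Blend the prompt shared from the previous layer with this layer's own prompt.

    ASSUMPTION: the gate is a single scalar per layer and the blend is
    g * layer_prompt + (1 - g) * prev_prompt. The paper may use a different form.
    """
    g = sigmoid(gate_logit)
    return [
        [g * new + (1.0 - g) * old for new, old in zip(new_row, old_row)]
        for new_row, old_row in zip(layer_prompt, prev_prompt)
    ]


def build_deep_prompts(num_layers, num_tokens, dim, gate_logits, seed=0):
    """Create per-layer prompts for ONE modality and apply gated sharing across layers.

    Returns (raw, effective): raw[l] is layer l's own prompt, effective[l] is
    what would actually be injected at layer l after gating.
    """
    rng = random.Random(seed)
    raw = [
        [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_tokens)]
        for _ in range(num_layers)
    ]
    effective = [raw[0]]  # the first layer has nothing earlier to share from
    for layer in range(1, num_layers):
        effective.append(gated_prompt(effective[-1], raw[layer], gate_logits[layer]))
    return raw, effective


def inject(prompt_tokens, input_tokens):
    """Prepend prompt tokens to the token sequence entering an encoder layer."""
    return prompt_tokens + input_tokens


# Modality-specific prompts: the text and visual encoders each get their own set
# (hypothetical sizes; in a real model the gate logits would be learned).
text_raw, text_prompts = build_deep_prompts(3, 2, 4, [0.0, 0.0, 0.0], seed=1)
vis_raw, vis_prompts = build_deep_prompts(3, 2, 6, [0.0, 0.5, -0.5], seed=2)

text_tokens = [[0.0] * 4 for _ in range(5)]  # stand-in for 5 text token embeddings
layer0_input = inject(text_prompts[0], text_tokens)
print(len(layer0_input))  # 2 prompt tokens + 5 input tokens
```

In this sketch a gate logit of 0 gives an equal mix of the carried-over and layer-specific prompts; larger logits favour the layer's own prompt, smaller ones favour the shared prompt.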
