Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Jian Lang; Zhangtao Cheng; Ting Zhong; Fan Zhou

arXiv:2501.01120 · cs.CV · June 17, 2025

TL;DR

This paper introduces RAGPT, a retrieval-augmented dynamic prompt tuning framework that significantly improves multimodal learning robustness with incomplete data by generating context-aware prompts through retrieval and missing information recovery.

Contribution

The paper proposes a novel retrieval-augmented dynamic prompt tuning method that addresses limitations of static prompts and dummy imputation in incomplete multimodal learning.

Findings

1. RAGPT outperforms baselines on three real-world datasets.
2. Dynamic prompts enhance robustness against missing modalities.
3. Retrieval-based missing information recovery improves task performance.

Abstract

Multimodal learning with incomplete modality is practical and challenging. Recently, researchers have focused on enhancing the robustness of pre-trained MultiModal Transformers (MMTs) under missing modality conditions by applying learnable prompts. However, these prompt-based methods face several limitations: (1) incomplete modalities provide restricted modal cues for task-specific inference, (2) dummy imputation for missing content causes information loss and introduces noise, and (3) static prompts are instance-agnostic, offering limited knowledge for instances with various missing conditions. To address these issues, we propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework. RAGPT comprises three modules: (I) the multi-channel retriever, which identifies similar instances through a within-modality retrieval strategy, (II) the missing modality generator, which…
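
The abstract is truncated here, so the details of the retriever are not available in this text. As a rough illustration of what a within-modality retrieval strategy could look like, the sketch below ranks stored instances by cosine similarity computed only in the modalities the query actually has. Everything in it is an assumption made for illustration: the function names `cosine` and `within_modality_retrieve`, the dictionary-based memory, the toy two-dimensional embeddings, and the averaging of per-modality scores are hypothetical and are not taken from the paper or from the jian-lang/ragpt repository.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def within_modality_retrieve(query, memory, k=2):
    """Return ids of the k stored instances most similar to the query.

    Hypothetical helper: similarity is computed only in modalities the
    query has (value is not None), and each query modality is compared
    against the same modality of the stored instance.
    """
    scores = []
    for inst_id, inst in memory.items():
        sims = [
            cosine(vec, inst[mod])
            for mod, vec in query.items()
            if vec is not None and inst.get(mod) is not None
        ]
        if sims:
            # Assumption: average the per-modality similarities.
            scores.append((sum(sims) / len(sims), inst_id))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [inst_id for _, inst_id in scores[:k]]


# Toy memory of modality-complete instances (made-up embeddings).
memory = {
    "a": {"text": [1.0, 0.0], "image": [0.9, 0.1]},
    "b": {"text": [0.0, 1.0], "image": [0.1, 0.9]},
    "c": {"text": [0.8, 0.2], "image": [0.7, 0.3]},
}

# Query with a missing image: only the text channel is used for retrieval.
query = {"text": [1.0, 0.1], "image": None}
neighbors = within_modality_retrieve(query, memory, k=2)
print(neighbors)
```

In a setup like this, the image embeddings of the retrieved neighbors would be available as candidate material for recovering the query's missing image, in place of a dummy (for example, all-zero) input. How RAGPT actually performs that recovery is not stated in the visible portion of the abstract.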

Code & Models

Repositories

jian-lang/ragpt (PyTorch, official)

Taxonomy

Topics: Speech and dialogue systems