MoPE: Mixture of Prompt Experts for Parameter-Efficient and Scalable Multimodal Fusion
Ruixiang Jiang, Lingbo Liu, Changwen Chen

TL;DR
MoPE introduces a scalable, parameter-efficient multimodal fusion framework that generates instance-specific prompts using a mixture of experts, significantly improving adaptivity and performance across diverse multimodal tasks.
Contribution
The paper proposes MoPE, a novel mixture of prompt experts framework that enhances prompt adaptivity and scalability for multimodal fusion by generating instance-specific prompts.
Findings
Achieves state-of-the-art results on six multimodal datasets.
Requires only 0.8% of the trainable parameters that fine-tuning uses.
Effectively scales with the number of experts without increasing prompt length.
Abstract
Despite the demonstrated parameter efficiency of prompt-based fusion, its limited adaptivity and expressiveness hinder its effectiveness for multimodal applications at scale. In this paper, we present the first comprehensive study addressing these limitations. Our key motivation is to "divide and conquer" the vanilla prompt, traditionally shared across all instances, by generating instance-specific prompts. Specifically, we propose the Mixture of Prompt Experts (MoPE), a framework that significantly enhances prompt adaptivity and expressiveness by dynamically generating instance-specific prompts. MoPE leverages multimodal pairings as additional evidence, allowing the model to adaptively select optimal prompts tailored to each individual instance. Unlike traditional prompt-fusion methods, which encounter scalability bottlenecks when optimizing long unified prompts, MoPE maintains fixed…
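The routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product router, the softmax weighting, the soft (weighted-sum) mixing, and every name below (`route`, `mix_prompts`, `make_experts`, `text_feature`) are assumptions chosen for illustration. It shows only the general pattern: a feature from the paired modality scores a pool of expert prompts, and the instance-specific prompt is their weighted combination, so its length stays the same however many experts are added.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(feature, expert_keys):
    """Assumed router: score each expert by the dot product between the
    paired-modality feature and that expert's routing key."""
    scores = [sum(f * k for f, k in zip(feature, key)) for key in expert_keys]
    return softmax(scores)


def mix_prompts(weights, expert_prompts):
    """Weighted sum of expert prompts. The result has the shape of a single
    expert prompt (prompt_len x dim), independent of the number of experts."""
    prompt_len = len(expert_prompts[0])
    dim = len(expert_prompts[0][0])
    return [
        [sum(w * p[t][d] for w, p in zip(weights, expert_prompts))
         for d in range(dim)]
        for t in range(prompt_len)
    ]


def make_experts(num_experts, prompt_len, dim, seed=0):
    """Hypothetical initialisation: random routing keys and prompt tokens
    standing in for parameters that would be learned."""
    rng = random.Random(seed)
    keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
    prompts = [
        [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(prompt_len)]
        for _ in range(num_experts)
    ]
    return keys, prompts


# Usage: a stand-in feature from the paired modality selects the prompt.
keys, prompts = make_experts(num_experts=4, prompt_len=2, dim=8)
text_feature = [0.1 * i for i in range(8)]
weights = route(text_feature, keys)
instance_prompt = mix_prompts(weights, prompts)
print(len(instance_prompt), len(instance_prompt[0]))
```

Increasing `num_experts` in this sketch changes only the size of the expert pool; the mixed prompt keeps its `prompt_len` by `dim` shape.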
Taxonomy
Topics: Robotics and Automated Systems · Anomaly Detection Techniques and Applications · Speech Recognition and Synthesis
Methods: Focus · Mixture of Experts
