MPA: Multimodal Prototype Augmentation for Few-Shot Learning
Liwen Wu, Wei Wang, Lei Zhao, Zhan Gao, Qika Lin, Shaowen Yao, Zuozhu Liu, Bin Pu

TL;DR
The paper introduces MPA, a multimodal prototype augmentation framework for few-shot learning that leverages language models, multi-view augmentations, and uncertainty modeling to significantly improve classification performance across diverse benchmarks.
Contribution
The novel MPA framework combines semantic enhancement, multi-view augmentation, and uncertainty absorption to advance few-shot learning with multimodal information.
Findings
MPA outperforms state-of-the-art methods on multiple benchmarks.
MPA achieves improvements of 12.29% and 24.56% in 5-way 1-shot tasks.
Effective integration of semantic and visual augmentations enhances model robustness.
Abstract
Recently, few-shot learning (FSL) has become a popular task that aims to recognize new classes from only a few labeled examples and has been widely applied in fields such as natural science, remote sensing, and medical images. However, most existing methods focus only on the visual modality and compute prototypes directly from raw support images, which lack comprehensive and rich multimodal information. To address these limitations, we propose a novel Multimodal Prototype Augmentation FSL framework called MPA, including LLM-based Multi-Variant Semantic Enhancement (LMSE), Hierarchical Multi-View Augmentation (HMA), and an Adaptive Uncertain Class Absorber (AUCA). LMSE leverages large language models to generate diverse paraphrased category descriptions, enriching the support set with additional semantic cues. HMA exploits both natural and multi-view augmentations to enhance feature…
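The abstract contrasts prototypes computed directly from raw support images with prototypes built from an enriched support set. As a rough illustration of that general idea only, the sketch below averages the features of each class over several views to form a prototype, then assigns a query to the nearest prototype by cosine similarity. It is a generic prototype-averaging example, not the MPA implementation: LMSE, HMA, and AUCA are not modeled, and the function names, the toy 2-D features, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_prototypes(support):
    """Map each label to the normalized mean of its normalized features.

    `support` is a hypothetical structure: label -> list of feature vectors,
    where the list holds the original shot(s) plus any augmented views.
    """
    return {
        label: l2_normalize(mean_vector([l2_normalize(f) for f in feats]))
        for label, feats in support.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    q = l2_normalize(query)
    return max(
        prototypes,
        key=lambda label: sum(a * b for a, b in zip(q, prototypes[label])),
    )


# Toy 2-way 1-shot episode: one feature per class plus two augmented views.
support = {
    "cat": [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0]],
    "dog": [[0.1, 1.0], [0.2, 0.9], [0.0, 1.1]],
}
prototypes = build_prototypes(support)
prediction = classify([0.8, 0.3], prototypes)
```

In this toy episode the extra views simply widen the set of features averaged per class. In a real pipeline the feature vectors would come from a trained encoder rather than being written by hand.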
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
