Exemplar Masking for Multimodal Incremental Learning

Yi-Lun Lee; Chen-Yu Lee; Wei-Chen Chiu; Yi-Hsuan Tsai

arXiv:2412.09549·cs.CV·December 13, 2024

TL;DR

This paper introduces an exemplar masking framework for multimodal incremental learning that reduces storage and computational costs while improving knowledge retention, using attention-based token masking and data augmentation techniques.

Contribution

It proposes a novel exemplar masking method combined with parameter-efficient tuning to enhance multimodal incremental learning efficiency and robustness.

Findings

01 Reduces exemplar storage size significantly.

02 Improves performance in retaining old knowledge.

03 Extends to real-world multimodal datasets.

Abstract

Multimodal incremental learning needs to digest information from multiple modalities while concurrently learning new knowledge without forgetting previously learned information. This task poses numerous challenges, chiefly the larger storage size of multimodal data in exemplar-based methods and the computational cost of fine-tuning huge multimodal models. In this paper, we leverage a parameter-efficient tuning scheme to reduce the burden of fine-tuning and propose an exemplar masking framework to efficiently replay old knowledge. Specifically, the non-important tokens are masked based on the attention weights and the correlation across different modalities, significantly reducing the storage size of an exemplar and consequently allowing more exemplars to be stored in the same memory buffer. Moreover, we design a multimodal data augmentation technique to…
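The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it keeps only the tokens with the highest attention scores and ignores the cross-modal correlation term the abstract also mentions. The function names (`keep_top_tokens`, `mask_exemplar`, `buffer_capacity`), the `keep_ratio` parameter, and the token-count memory model are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def keep_top_tokens(attn_scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Return the sorted indices of the highest-scoring tokens.

    Hypothetical helper: the paper also uses cross-modal correlation,
    which this sketch does not model.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(attn_scores) * keep_ratio))
    # Rank token positions by attention score, highest first.
    ranked = sorted(range(len(attn_scores)),
                    key=lambda i: attn_scores[i], reverse=True)
    return sorted(ranked[:n_keep])


def mask_exemplar(tokens: Sequence[str],
                  attn_scores: Sequence[float],
                  keep_ratio: float) -> List[Tuple[int, str]]:
    """Store only (position, token) pairs for the tokens that are kept."""
    return [(i, tokens[i]) for i in keep_top_tokens(attn_scores, keep_ratio)]


def buffer_capacity(buffer_tokens: int,
                    tokens_per_exemplar: int,
                    keep_ratio: float) -> int:
    """Number of exemplars that fit in a buffer measured in tokens (assumed model)."""
    stored = max(1, round(tokens_per_exemplar * keep_ratio))
    return buffer_tokens // stored


if __name__ == "__main__":
    scores = [0.05, 0.40, 0.10, 0.30, 0.15]
    print(mask_exemplar(["a", "b", "c", "d", "e"], scores, keep_ratio=0.4))
    print(buffer_capacity(1000, 100, 1.0), buffer_capacity(1000, 100, 0.25))
```

Under this simplified memory model, keeping a quarter of each exemplar's tokens lets a fixed buffer hold four times as many exemplars, which is the trade-off the abstract describes.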

Code & Models

Repositories

yilunlee/exemplar_masking_mcil
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling

Methods: Softmax · Attention Is All You Need