Multi-Modal Continual Learning via Cross-Modality Adapters and Representation Alignment with Knowledge Preservation
Evelyn Chee, Wynne Hsu, Mong Li Lee

TL;DR
This paper introduces a novel multi-modal continual learning framework that leverages cross-modality adapters and representation alignment to effectively integrate diverse sensory data while preventing catastrophic forgetting.
Contribution
It proposes a pre-trained model-based approach with a mixture-of-experts adapter and a new representation alignment loss for multi-modal continual learning.
Findings
Outperforms baselines in class-incremental learning
Achieves higher accuracy on multi-modal datasets
Reduces catastrophic forgetting effectively
Abstract
Continual learning is essential for adapting models to new tasks while retaining previously acquired knowledge. While existing approaches predominantly focus on uni-modal data, multi-modal learning offers substantial benefits by utilizing diverse sensory inputs, akin to human perception. However, multi-modal continual learning presents additional challenges, as the model must effectively integrate new information from various modalities while preventing catastrophic forgetting. In this work, we propose a pre-trained model-based framework for multi-modal continual learning. Our framework includes a novel cross-modality adapter with a mixture-of-experts structure to facilitate effective integration of multi-modal information across tasks. We also introduce a representation alignment loss that fosters learning of robust multi-modal representations, and regularize relationships between…
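To make the adapter idea concrete, here is a minimal sketch of a mixture-of-experts adapter over two modalities. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the architecture, so the class name `MoEAdapter`, the bottleneck experts, the softmax router over concatenated features, and the residual connection onto the first modality are all hypothetical choices. It uses only the Python standard library.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class MoEAdapter:
    """Hypothetical cross-modality adapter: a router mixes bottleneck experts."""

    def __init__(self, dim, bottleneck, num_experts, seed=0):
        rng = random.Random(seed)
        # The router scores each expert from the concatenated modality features.
        self.router = random_matrix(num_experts, 2 * dim, rng)
        # Each expert is a down-projection followed by an up-projection.
        self.experts = [
            (random_matrix(bottleneck, 2 * dim, rng),
             random_matrix(dim, bottleneck, rng))
            for _ in range(num_experts)
        ]

    def gate(self, feat_a, feat_b):
        """Return one mixing weight per expert (weights sum to 1)."""
        return softmax(matvec(self.router, feat_a + feat_b))

    def forward(self, feat_a, feat_b):
        """Fuse two modality feature vectors into one vector of size dim."""
        joint = feat_a + feat_b  # list concatenation, length 2 * dim
        weights = self.gate(feat_a, feat_b)
        fused = list(feat_a)  # residual path (an assumption of this sketch)
        for weight, (down, up) in zip(weights, self.experts):
            hidden = [max(0.0, h) for h in matvec(down, joint)]  # ReLU
            update = matvec(up, hidden)
            fused = [f + weight * u for f, u in zip(fused, update)]
        return fused


adapter = MoEAdapter(dim=4, bottleneck=2, num_experts=3)
feat_a = [0.5, -1.0, 0.25, 2.0]   # e.g. features from one modality
feat_b = [1.0, 0.0, -0.5, 0.75]   # e.g. features from another modality
weights = adapter.gate(feat_a, feat_b)
fused = adapter.forward(feat_a, feat_b)
```

In a continual-learning setting, one plausible use of such a design is to keep the pre-trained backbone frozen and train only the small adapter parameters per task, but how the paper allocates or updates experts across tasks is not stated in the abstract.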
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection
