MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark
Haiyang Guo, Fei Zhu, Hongbo Zhao, Fanhu Zeng, Wenzhuo Liu, Shijie Ma, Da-Han Wang, Xu-Yao Zhang

TL;DR
MCITlib is a library and benchmark suite for Multimodal Continual Instruction Tuning. It targets the challenges that Multimodal Large Language Models (MLLMs) face in Multimodal Continual Learning (MCL): catastrophic forgetting and cross-modal coordination.
Contribution
MCITlib implements 8 representative continual learning algorithms and evaluates them on 3 benchmarks under 2 backbone models, providing a shared codebase for MCL research.
Findings
Implements 8 representative algorithms.
Evaluates them on 3 benchmarks under 2 backbone models.
Will be continuously updated to support future MCL developments.
Abstract
Continual learning enables AI systems to acquire new knowledge while retaining previously learned information. While traditional unimodal methods have made progress, the rise of Multimodal Large Language Models (MLLMs) brings new challenges in Multimodal Continual Learning (MCL), where models are expected to address both catastrophic forgetting and cross-modal coordination. To advance research in this area, we present MCITlib, a comprehensive library for Multimodal Continual Instruction Tuning. MCITlib currently implements 8 representative algorithms and conducts evaluations on 3 benchmarks under 2 backbone models. The library will be continuously updated to support future developments in MCL. The codebase is released at https://github.com/Ghy0501/MCITlib.
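To make "catastrophic forgetting" concrete, the sketch below shows one common way continual learning results are summarized: an accuracy matrix in which row i holds the accuracy on every task after training on task i, reduced to an average final accuracy and an average forgetting score. The function names and the numbers are illustrative assumptions; they are not taken from the MCITlib codebase and do not reproduce any result from the paper.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks after training on the final task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop from each earlier task's best past accuracy to its final accuracy."""
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for task in range(num_tasks - 1):
        # Best accuracy on this task at any stage before the final one,
        # counting only stages at or after the task was first learned.
        best_past = max(acc[stage][task] for stage in range(task, num_tasks - 1))
        drops.append(best_past - acc[-1][task])
    return sum(drops) / len(drops)


# Hypothetical accuracy matrix for three sequential tasks.
# acc[i][j] = accuracy on task j after training on task i (0.0 = not yet trained).
acc = [
    [80.0, 0.0, 0.0],
    [70.0, 85.0, 0.0],
    [60.0, 75.0, 90.0],
]

print(average_accuracy(acc))    # 75.0
print(average_forgetting(acc))  # 15.0
```

In this made-up example the model ends with 75.0 average accuracy but has lost 15.0 points on average on the earlier tasks, which is the kind of degradation continual learning methods try to reduce.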
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Educational Tools and Methods
