MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark

Haiyang Guo; Fei Zhu; Hongbo Zhao; Fanhu Zeng; Wenzhuo Liu; Shijie Ma; Da-Han Wang; Xu-Yao Zhang

arXiv:2508.07307 · cs.CV · January 1, 2026

Open Access

TL;DR

MCITlib is a new library and benchmark suite designed to facilitate research in Multimodal Continual Learning, addressing challenges like catastrophic forgetting and cross-modal coordination in Multimodal Large Language Models.

Contribution

MCITlib provides a library that implements 8 representative Multimodal Continual Instruction Tuning algorithms and evaluates them on 3 benchmarks under 2 backbone models, giving Multimodal Continual Learning research a common codebase to build on.

Findings

01

Implemented 8 representative algorithms.

02

Evaluated on 3 benchmarks with 2 backbone models.

03

Provides an open-source platform that will be continuously updated to support future Multimodal Continual Learning (MCL) developments.

Abstract

Continual learning enables AI systems to acquire new knowledge while retaining previously learned information. While traditional unimodal methods have made progress, the rise of Multimodal Large Language Models (MLLMs) brings new challenges in Multimodal Continual Learning (MCL), where models are expected to address both catastrophic forgetting and cross-modal coordination. To advance research in this area, we present MCITlib, a comprehensive library for Multimodal Continual Instruction Tuning. MCITlib currently implements 8 representative algorithms and conducts evaluations on 3 benchmarks under 2 backbone models. The library will be continuously updated to support future developments in MCL. The codebase is released at https://github.com/Ghy0501/MCITlib.
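
To make the notion of catastrophic forgetting concrete, the sketch below computes two metrics that are commonly used in the continual learning literature: final average accuracy and average forgetting. It is an illustration only. The function names, the accuracy-matrix layout, and the numbers are assumptions made for this example; nothing here is taken from MCITlib's code, and the abstract does not state which metrics the library reports.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks after training on the final task.

    acc[i][j] is the accuracy on task j measured after training on tasks 0..i.
    """
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop from each earlier task's best accuracy to its final accuracy."""
    n = len(acc)
    if n < 2:
        return 0.0  # with a single task there is nothing to forget
    drops = []
    for j in range(n - 1):  # the last task has not had a chance to be forgotten
        # Best accuracy on task j at any step from when it was learned
        # up to (but excluding) the final step.
        best = max(acc[i][j] for i in range(j, n - 1))
        drops.append(best - acc[-1][j])
    return sum(drops) / len(drops)


# Made-up accuracies for three sequential tasks (hypothetical, not results
# from the paper). Entries above the diagonal are unused placeholders.
acc = [
    [80.0, 0.0, 0.0],
    [70.0, 75.0, 0.0],
    [60.0, 65.0, 90.0],
]

print(average_accuracy(acc))
print(average_forgetting(acc))
```

In this toy matrix, accuracy on the first task falls from 80 to 60 and on the second from 75 to 65 as later tasks are learned, which is the kind of degradation continual instruction tuning methods aim to reduce.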

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques · Educational Tools and Methods