CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model
Cheng Chen, Junchen Zhu, Xu Luo, Hengtao Shen, Lianli Gao, and Jingkuan Song

TL;DR
This paper introduces CoIN, a comprehensive benchmark for evaluating Multimodal Large Language Models in continual instruction tuning, revealing current models' challenges with forgetting and proposing a method to improve retention.
Contribution
The paper presents CoIN, a new benchmark for continual instruction tuning of MLLMs, and proposes MoELoRA to mitigate instruction alignment forgetting during sequential learning.
Findings
Current MLLMs suffer from catastrophic forgetting in instruction tuning.
Instruction alignment failure is the main cause of performance decline.
MoELoRA effectively reduces forgetting in instruction alignment.
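To make the notion of forgetting under sequential tuning concrete, here is a minimal sketch of two standard continual-learning quantities: the average accuracy after the final stage, and the average drop from each earlier task's best score to its final score. These are generic measures and not necessarily the metrics CoIN reports. The function names, the `acc[i][j]` layout (accuracy on task j after tuning on tasks 0..i), and the numbers are illustrative assumptions, not taken from the paper.

```python
from typing import List


def final_average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks, measured after the last tuning stage."""
    last_row = acc[-1]
    return sum(last_row) / len(last_row)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop from each earlier task's best score to its final score.

    acc[i][j] is the (hypothetical) accuracy on task j after tuning on
    tasks 0..i. Entries with j > i are never read. The last task is
    excluded because it cannot have been forgotten yet.
    """
    n = len(acc)
    if n < 2:
        return 0.0
    drops = []
    for j in range(n - 1):
        # Best score on task j at any stage before the final one.
        best_earlier = max(acc[i][j] for i in range(j, n - 1))
        drops.append(best_earlier - acc[n - 1][j])
    return sum(drops) / len(drops)


# Made-up scores for three tasks tuned in sequence (not results from the paper).
example_acc = [
    [0.80, 0.00, 0.00],
    [0.60, 0.70, 0.00],
    [0.50, 0.65, 0.90],
]

if __name__ == "__main__":
    print(round(final_average_accuracy(example_acc), 3))
    print(round(average_forgetting(example_acc), 3))
```

A larger forgetting value means earlier tasks degraded more as later tasks were learned. A method that mitigates forgetting would be expected to lower it without reducing the final average accuracy.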
Abstract
Instruction tuning represents a prevalent strategy employed by Multimodal Large Language Models (MLLMs) to align with human instructions and adapt to new tasks. Nevertheless, MLLMs encounter the challenge of adapting to users' evolving knowledge and demands. Therefore, how to retain existing skills while acquiring new knowledge needs to be investigated. In this paper, we present a comprehensive benchmark, namely Continual Instruction tuNing (CoIN), to assess existing MLLMs in the sequential instruction tuning paradigm. CoIN comprises 10 commonly used datasets spanning 8 task categories, ensuring a diverse range of instructions and tasks. Besides, the trained model is evaluated from two aspects: Instruction Following and General Knowledge, which assess the alignment with human intention and knowledge preserved for reasoning, respectively. Experiments on CoIN demonstrate that current…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: ALIGN
