SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning

Jinpeng Chen; Runmin Cong; Yuzhi Zhao; Hongzheng Yang; Guangneng Hu; Horace Ho Shing Ip; Sam Kwong

arXiv:2505.02486 · cs.LG · May 6, 2025

TL;DR

This paper introduces SEFE, a method combining Answer Style Diversification and RegLoRA to mitigate superficial and essential forgetting in multimodal continual instruction tuning, achieving state-of-the-art results.

Contribution

It proposes a novel framework that distinguishes and addresses superficial and essential forgetting in multimodal models, with new techniques for style diversification and parameter regularization.

Findings

01. SEFE outperforms existing methods on benchmark tasks.

02. Answer Style Diversification effectively prevents superficial forgetting.

03. RegLoRA stabilizes key parameters, reducing essential forgetting.

Abstract

Multimodal Continual Instruction Tuning (MCIT) aims to enable Multimodal Large Language Models (MLLMs) to incrementally learn new tasks without catastrophic forgetting. In this paper, we explore forgetting in this context, categorizing it into superficial forgetting and essential forgetting. Superficial forgetting refers to cases where the model's knowledge may not be genuinely lost, but its responses to previous tasks deviate from expected formats due to the influence of subsequent tasks' answer styles, making the results unusable. By contrast, essential forgetting refers to situations where the model provides correctly formatted but factually inaccurate answers, indicating a true loss of knowledge. Assessing essential forgetting necessitates addressing superficial forgetting first, as severe superficial forgetting can obscure the model's knowledge state. Hence, we first introduce the…
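The distinction between the two kinds of forgetting can be illustrated with a minimal sketch. It assumes that a task's expected answer format can be captured by a regular expression and that correctness is an exact, case-insensitive match. The function `classify_response`, the `format_pattern` parameter, the label strings, and the example task are all hypothetical illustrations, not the paper's evaluation code.

```python
import re


def classify_response(response: str, expected: str, format_pattern: str) -> str:
    """Label one response on a previously learned task (hypothetical helper).

    Returns:
        "superficial" if the response does not follow the expected format,
        "essential"   if the format is right but the answer is wrong,
        "retained"    if the format is right and the answer is correct.
    """
    answer = response.strip()
    # A format violation is checked first: it hides whether knowledge survived.
    if re.fullmatch(format_pattern, answer) is None:
        return "superficial"
    # Well-formed but wrong answers indicate a real loss of knowledge.
    if answer.lower() != expected.strip().lower():
        return "essential"
    return "retained"


# Assumed task whose expected answer style is a single option letter.
OPTION_LETTER = r"[A-D]"

print(classify_response("The answer is a red bus.", "B", OPTION_LETTER))
print(classify_response("C", "B", OPTION_LETTER))
print(classify_response("B", "B", OPTION_LETTER))
```

Checking the format before the answer mirrors the abstract's ordering: a response in the wrong style cannot be scored for correctness, so superficial forgetting has to be ruled out before essential forgetting can be assessed.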

Code & Models

Repositories

jinpeng0528/sefe
PyTorch · Official

Datasets

jinpeng0528/CoIN-ASD
dataset · 120 downloads

Videos

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning · SlidesLive

Taxonomy

Topics: Speech and dialogue systems · Speech and Audio Processing · Speech Recognition and Synthesis