Continual-NExT: A Unified Comprehension And Generation Continual Learning Framework

Jingyang Qiao; Zhizhong Zhang; Xin Tan; Jingyu Gong; Yanyun Qu; Yuan Xie

arXiv:2602.18055 · cs.LG · February 23, 2026

Open Access

TL;DR

This paper introduces Continual-NExT, a framework for lifelong learning in Dual-to-Dual Multimodal Large Language Models, addressing challenges like catastrophic forgetting and knowledge transfer with a novel MAGE method.

Contribution

The paper proposes a standardized continual learning framework and a new MAGE method to enhance knowledge transfer and reduce forgetting in Dual-to-Dual MLLMs.

Findings

01 MAGE outperforms existing continual learning methods.

02 Continual-NExT achieves state-of-the-art results.

03 Framework effectively mitigates catastrophic forgetting.

Abstract

Dual-to-Dual MLLMs are Multimodal Large Language Models that unify multimodal comprehension and generation across the text and image modalities. Although they exhibit strong instantaneous learning and generalization capabilities, Dual-to-Dual MLLMs remain deficient in lifelong evolution, which significantly limits their continual adaptation to dynamic real-world scenarios. One challenge is that learning new tasks inevitably destroys previously learned knowledge. Beyond traditional catastrophic forgetting, Dual-to-Dual MLLMs face further challenges, including hallucination, instruction unfollowing, and failures in cross-modal knowledge transfer. However, no standardized continual learning framework for Dual-to-Dual MLLMs has been established yet, leaving these challenges unexplored. Thus, in this paper, we establish Continual-NExT, a continual learning framework for…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis