LLaVA-c: Continual Improved Visual Instruction Tuning

Wenzhuo Liu; Fei Zhu; Haiyang Guo; Longhui Wei; Cheng-Lin Liu

arXiv:2506.08666·cs.CV·June 16, 2025

Open Access

TL;DR

LLaVA-c introduces a continual learning approach to visual instruction tuning that combines spectral-aware consolidation, for better task balance, with inquiry regularization, to prevent degradation of the base model. It matches or surpasses multitask joint training.

Contribution

The paper proposes a continual learning method for visual instruction tuning that preserves the base model's general capabilities while performing competitively with multitask joint learning.

Findings

01 Enhanced benchmark performance across tasks

02 Preserved general capabilities during continual learning

03 Achieved or surpassed multitask joint learning results

Abstract

Multimodal models like LLaVA-1.5 achieve state-of-the-art visual understanding through visual instruction tuning on multitask datasets, enabling strong instruction-following and multimodal performance. However, multitask learning faces two challenges: task balancing, which requires careful adjustment of data proportions, and expansion costs, where new tasks risk catastrophic forgetting and need costly retraining. Continual learning provides a promising alternative, acquiring new knowledge incrementally while preserving existing capabilities. However, current methods prioritize task-specific performance, neglecting base model degradation from overfitting to specific instructions, which undermines general capabilities. In this work, we propose a simple but effective method with two modifications on LLaVA-1.5: spectral-aware consolidation for improved task balance and unsupervised inquiry…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Balanced Selection