LLaVA-c: Continual Improved Visual Instruction Tuning
Wenzhuo Liu, Fei Zhu, Haiyang Guo, Longhui Wei, Cheng-Lin Liu

TL;DR
LLaVA-c introduces a continual learning approach to visual instruction tuning built on spectral-aware consolidation and inquiry regularization. It improves task balance and prevents base model degradation, matching or surpassing multitask joint learning.
Contribution
The paper proposes a continual learning method for visual instruction tuning that preserves the base model's general capabilities while remaining competitive with multitask joint learning.
Findings
Enhanced benchmark performance across tasks
Preserved general capabilities during continual learning
Achieved or surpassed multitask joint learning results
Abstract
Multimodal models like LLaVA-1.5 achieve state-of-the-art visual understanding through visual instruction tuning on multitask datasets, enabling strong instruction-following and multimodal performance. However, multitask learning faces challenges such as task balancing, requiring careful adjustment of data proportions, and expansion costs, where new tasks risk catastrophic forgetting and need costly retraining. Continual learning provides a promising alternative, acquiring new knowledge incrementally while preserving existing capabilities. However, current methods prioritize task-specific performance, neglecting base model degradation from overfitting to specific instructions, which undermines general capabilities. In this work, we propose a simple but effective method with two modifications on LLaVA-1.5: spectral-aware consolidation for improved task balance and unsupervised inquiry…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: Balanced Selection
