TL;DR
This paper introduces VILA, a dual-branch framework that enhances class-incremental learning with pre-trained models by applying vision-language calibration at both the feature level and the decision level, improving stability without sacrificing efficiency.
Contribution
VILA advances analytic class-incremental learning by integrating geometric and semantic calibration strategies, addressing representation rigidity and prediction bias.
Findings
VILA outperforms existing methods across eight benchmarks.
It maintains efficiency while improving stability in fine-grained scenarios.
The framework effectively combines feature-level and decision-level calibration.
Abstract
Class-incremental learning (CIL) with pre-trained models (PTMs) faces a critical trade-off between efficient adaptation and long-term stability. While analytic learning enables rapid, recursive closed-form updates, its efficacy is often compromised by accumulated errors and feature incompatibility. In this paper, we first conduct a systematic study to dissect the failure modes of PTM-based analytic CIL, identifying representation rigidity as the primary bottleneck. Motivated by this insight, we propose VILA, a novel dual-branch framework that advances analytic CIL via a two-level vision-language calibration strategy. Specifically, we coherently fuse plastic, task-adapted features with a frozen, universal visual anchor at the feature level through geometric calibration, and leverage cross-modal semantic priors at the decision level to rectify prediction bias. This confluence maintains…
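The "rapid, recursive closed-form updates" that the abstract attributes to analytic learning can be illustrated with a generic recursive ridge-regression classifier on frozen features. The sketch below is not VILA's method and does not implement either calibration step. It assumes one-hot targets, a linear head, and a Woodbury-identity update. The names `init_state`, `recursive_update`, and `ridge` are hypothetical and chosen for illustration only.

```python
import numpy as np

def init_state(feature_dim, num_classes, ridge=1.0):
    """Start with zero weights and R = (ridge * I)^-1, the inverse of the
    regularized feature autocorrelation matrix before any data is seen."""
    W = np.zeros((feature_dim, num_classes))
    R = np.eye(feature_dim) / ridge
    return W, R

def recursive_update(W, R, X, Y):
    """Absorb one task's data without revisiting earlier tasks.

    X: (n, d) features from a frozen backbone (assumed given).
    Y: (n, c) one-hot targets; c may exceed W.shape[1] when new classes arrive.
    """
    # Grow the head with zero columns for classes first seen in this task.
    extra = Y.shape[1] - W.shape[1]
    if extra > 0:
        W = np.hstack([W, np.zeros((W.shape[0], extra))])

    # Woodbury identity: R_new = R - R X^T (I + X R X^T)^-1 X R
    K = np.linalg.solve(np.eye(X.shape[0]) + X @ R @ X.T, X @ R)
    R = R - R @ X.T @ K

    # Closed-form correction of the weights using only the new batch.
    W = W + R @ X.T @ (Y - X @ W)
    return W, R
```

Under these assumptions, applying `recursive_update` task by task yields the same weights as solving one ridge regression on all tasks jointly, which is why this family of methods does not forget in the linear head. The abstract's concern about representation rigidity applies to the frozen features fed in as `X`, which this sketch takes as given.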
