CRISP: Contrastive Residual Injection and Semantic Prompting for Continual Video Instance Segmentation
Baichen Liu, Qi Lyu, Xudong Wang, Jiahua Dong, Lianqing Liu, Zhi Han

TL;DR
CRISP is a framework that combines contrastive residual injection with semantic prompting for continual video instance segmentation, balancing the learning of new categories against the retention of previously learned ones.
Contribution
The paper proposes instance-wise, category-wise, and task-wise modules that target confusion at each of those levels and mitigate catastrophic forgetting in continual video instance segmentation.
Findings
Outperforms existing methods on YouTube-VIS datasets
Effectively reduces catastrophic forgetting
Improves segmentation and classification accuracy
Abstract
Continual video instance segmentation demands both the plasticity to absorb new object categories and the stability to retain previously learned ones, all while preserving temporal consistency across frames. In this work, we introduce Contrastive Residual Injection and Semantic Prompting (CRISP), an early attempt tailored to address the instance-wise, category-wise, and task-wise confusion in continual video instance segmentation. For instance-wise learning, we model instance tracking and construct an instance correlation loss, which emphasizes the correlation with the prior query space while strengthening the specificity of the current task query. For category-wise learning, we build an adaptive residual semantic prompt (ARSP) learning framework, which constructs a learnable semantic residual prompt pool generated by category text and uses an adjustive query-prompt matching mechanism to…
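The abstract is cut off before it says how query-prompt matching works, so the following is only an illustrative sketch of one common way such a step could be built, not the authors' method. It assumes each category has one prompt vector, that a query is matched to the prompt with the highest cosine similarity, and that the matched prompt is added to the query as a residual. The names `cosine` and `match_and_inject` and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_and_inject(queries, prompt_pool):
    """Match each query to its closest prompt and add that prompt as a residual.

    Assumption (not from the paper): one prompt per category, hard
    arg-max matching, and plain addition as the residual injection.
    """
    injected, matched = [], []
    for q in queries:
        scores = [cosine(q, p) for p in prompt_pool]
        best = max(range(len(prompt_pool)), key=scores.__getitem__)
        matched.append(best)
        # Residual injection: the query keeps its own content and
        # receives the matched prompt on top of it.
        injected.append([qi + pi for qi, pi in zip(q, prompt_pool[best])])
    return injected, matched


# Toy data: three category prompts and two instance queries.
prompt_pool = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[0.1, 0.2, 0.9], [0.8, 0.1, 0.3]]
injected, matched = match_and_inject(queries, prompt_pool)
print(matched)
```

In this toy setup the first query lies closest to the third prompt and the second query to the first prompt. A learned system would presumably train the prompt pool and may use a softer matching rule; the abstract as shown here does not specify either.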
