CRISP: Contrastive Residual Injection and Semantic Prompting for Continual Video Instance Segmentation

Baichen Liu; Qi Lyu; Xudong Wang; Jiahua Dong; Lianqing Liu; Zhi Han

arXiv:2508.10432·cs.CV·August 15, 2025

TL;DR

CRISP combines contrastive residual injection with semantic prompting for continual video instance segmentation, aiming to balance learning new categories against retaining previously learned ones.

Contribution

The method comprises instance-wise, category-wise, and task-wise modules that target confusion at each of these levels, along with catastrophic forgetting, in continual video instance segmentation.

Findings

01 Outperforms existing methods on YouTube-VIS datasets

02 Effectively reduces catastrophic forgetting

03 Improves segmentation and classification accuracy

Abstract

Continual video instance segmentation demands both the plasticity to absorb new object categories and the stability to retain previously learned ones, all while preserving temporal consistency across frames. In this work, we introduce Contrastive Residual Injection and Semantic Prompting (CRISP), an early attempt to address instance-wise, category-wise, and task-wise confusion in continual video instance segmentation. For instance-wise learning, we model instance tracking and construct an instance correlation loss, which emphasizes the correlation with the prior query space while strengthening the specificity of the current task query. For category-wise learning, we build an adaptive residual semantic prompt (ARSP) learning framework, which constructs a learnable semantic residual prompt pool generated from category text and uses an adjustive query-prompt matching mechanism to…
