CASA: Class-Agnostic Shared Attributes in Vision-Language Models for Efficient Incremental Object Detection
Mingyi Guo, Yuyang Liu, Zhiyuan Yan, Zongying Lin, Peixi Peng, and Yonghong Tian

TL;DR
This paper introduces CASA, a novel approach for incremental object detection that mitigates catastrophic forgetting by learning shared, category-agnostic attributes, leveraging language models and attribute selection to improve performance on sequential tasks.
Contribution
CASA is the first method to incorporate shared, category-agnostic attributes generated by language models for incremental object detection, enhancing knowledge retention and adaptability.
Findings
Achieves state-of-the-art results on the COCO dataset.
Effectively mitigates catastrophic forgetting in incremental detection.
Uses an LLM to generate candidate textual attributes, then selects the most relevant ones based on the current training data.
Abstract
Incremental object detection is fundamentally challenged by catastrophic forgetting. A major factor contributing to this issue is background shift, where background categories in sequential tasks may overlap with either previously learned or future unseen classes. To address this, we propose a novel method called Class-Agnostic Shared Attribute Base (CASA) that encourages the model to learn category-agnostic attributes shared across incremental classes. Our approach leverages an LLM to generate candidate textual attributes, selects the most relevant ones based on the current training data, and records their importance in an assignment matrix. For subsequent tasks, the retained attributes are frozen, and new attributes are selected from the remaining candidates, ensuring both knowledge retention and adaptability. Extensive experiments on the COCO dataset demonstrate the state-of-the-art…
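The select-then-freeze loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `AttributeBase`, the helper `select_attributes`, the top-k selection rule, and the toy relevance scores are all hypothetical. In practice the scores would presumably come from a vision-language model comparing image features with attribute text, which is assumed here rather than computed.

```python
from typing import Dict, List, Set


def select_attributes(scores: Dict[str, List[float]], frozen: Set[int], k: int) -> List[int]:
    """Pick the k not-yet-frozen candidates with the highest total relevance.

    scores maps each class of the current task to one relevance value per
    candidate attribute (hypothetical values; top-k is an assumed rule).
    """
    n = len(next(iter(scores.values())))
    totals = [sum(row[j] for row in scores.values()) for j in range(n)]
    remaining = [j for j in range(n) if j not in frozen]
    remaining.sort(key=lambda j: totals[j], reverse=True)
    return remaining[:k]


class AttributeBase:
    """Toy shared attribute base with a per-class assignment matrix."""

    def __init__(self, candidates: List[str]) -> None:
        self.candidates = list(candidates)
        self.frozen: List[int] = []  # retained attribute indices, never removed
        self.assignment: Dict[str, Dict[int, float]] = {}  # class -> {attribute: importance}

    def learn_task(self, scores: Dict[str, List[float]], k: int) -> List[str]:
        new = select_attributes(scores, set(self.frozen), k)
        active = self.frozen + new
        # New classes may use old and new attributes; rows of old classes are untouched.
        for cls, row in scores.items():
            self.assignment[cls] = {j: row[j] for j in active}
        self.frozen = active
        return [self.candidates[j] for j in new]


# Hypothetical candidates, standing in for LLM-generated attribute text.
candidates = ["has wheels", "has fur", "is metallic", "has four legs"]
base = AttributeBase(candidates)

task1 = {"car": [0.9, 0.0, 0.8, 0.1], "bus": [0.8, 0.1, 0.7, 0.0]}
new1 = base.learn_task(task1, k=2)

task2 = {"cat": [0.0, 0.9, 0.0, 0.6], "dog": [0.1, 0.8, 0.1, 0.7]}
new2 = base.learn_task(task2, k=1)

print(new1, new2)
```

In this sketch the second task can only draw from candidates that the first task left unselected, while the rows recorded for `car` and `bus` stay unchanged, which mirrors the retention and adaptability goals stated in the abstract.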
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Contrastive Language-Image Pre-training · Balanced Selection
