CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
Shuailei Ma, Yuefeng Wang, Jiaqi Fan, Ying Wei, Thomas H. Li, Hongli Liu, Fanbing Lv

TL;DR
This paper introduces CAT, a cascade detection transformer that decouples localization and identification for open-world object detection, utilizing a self-adaptive pseudo-labeling mechanism to improve unknown object detection and incremental learning.
Contribution
The paper proposes a novel cascade detection transformer with a self-adaptive pseudo-labeling mechanism, enhancing open-world object detection by better localizing and identifying known and unknown objects.
Findings
Outperforms state-of-the-art on MS-COCO and PASCAL VOC datasets.
Significantly improves detection of unknown objects and incremental learning.
Achieves higher scores than prior methods on OWOD, incremental object detection (IOD), and open-set detection tasks.
Abstract
Open-world object detection (OWOD), a more general and challenging goal, requires a model trained on data of known objects to detect both known and unknown objects and to incrementally learn to identify the unknown ones. Existing works, which employ a standard detection framework and a fixed pseudo-labelling mechanism (PLM), have the following problems: (i) Including unknown-object detection substantially reduces the model's ability to detect known objects. (ii) The PLM does not adequately utilize the prior knowledge of inputs. (iii) The fixed selection manner of the PLM cannot guarantee that the model is trained in the right direction. We observe that humans subconsciously prefer to focus on all foreground objects first and then identify each one in detail, rather than localizing and identifying a single object simultaneously, which alleviates confusion. This motivates us to propose a…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Adam · Byte Pair Encoding · Residual Connection · Label Smoothing · Dropout · Dense Connections
