Exploration of visual prompt in Grounded pre-trained open-set detection

Qibo Chen; Weizhong Jin; Shuchang Li; Mengdi Liu; Li Yu; Jian Jiang; Xiaozheng Wang

arXiv:2312.08839 · cs.CV · December 15, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a visual prompt approach for open-set object detection that learns new categories from a few labeled images, improving generalization to those categories without manual prompt design.

Contribution

The paper proposes a statistical-based visual prompt construction method and task-specific similarity dictionaries to enhance open-set detection performance.

Findings

01 Outperforms existing prompt learning methods on the ODinW dataset

02 Gives more consistent results in combinatorial inference

03 Effectively models new categories with few labeled images

Abstract

Text prompts are crucial for generalizing pre-trained open-set object detection models to new categories. However, current methods for text prompts are limited as they require manual feedback when generalizing to new categories, which restricts their ability to model complex scenes, often leading to incorrect detection results. To address this limitation, we propose a novel visual prompt method that learns new category knowledge from a few labeled images, which generalizes the pre-trained detection model to the new category. To allow visual prompts to represent new categories adequately, we propose a statistical-based prompt construction module that is not limited by predefined vocabulary lengths, thus allowing more vectors to be used when representing categories. We further utilize the category dictionaries in the pre-training dataset to design task-specific similarity dictionaries,…

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling