Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Jiaming Li; Jiacheng Zhang; Jichang Li; Ge Li; Si Liu; Liang Lin; Guanbin Li

arXiv:2406.00510·cs.CV·June 4, 2024

Open Access

TL;DR

This paper introduces a novel open vocabulary object detection framework that learns background prompts to utilize implicit background knowledge, improving detection of both known and unknown object categories.

Contribution

The proposed LBP framework uniquely learns background prompts and modules to better interpret background information, enhancing open vocabulary detection performance.

Findings

01. Outperforms state-of-the-art methods on the OV-COCO and OV-LVIS datasets.

02. Effectively leverages implicit background knowledge for improved detection.

03. Enhances recognition of both base and novel categories.

Abstract

Open vocabulary object detection (OVD) aims at seeking an optimal object detector capable of recognizing objects from both base and novel categories. Recent advances leverage knowledge distillation to transfer insightful knowledge from pre-trained large-scale vision-language models to the task of object detection, significantly generalizing the powerful capabilities of the detector to identify more unknown object categories. However, these methods face significant challenges in background interpretation and model overfitting, and thus often lose crucial background knowledge, giving rise to sub-optimal inference performance of the detector. To mitigate these issues, we present a novel OVD framework, termed LBP, that learns background prompts to harness explored implicit background knowledge, thus enhancing the detection performance w.r.t. base and novel categories.…
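To make the idea of scoring regions against category and background embeddings concrete, here is a minimal sketch in plain Python. It is not the LBP implementation: the function names, the toy three-dimensional embeddings, the temperature value, and the choice to take the maximum similarity over several background embeddings are all assumptions made for illustration. In a real detector the category embeddings would come from a vision-language model's text encoder and the background embeddings would be learned during training rather than fixed by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_region(region_emb, category_embs, background_embs, temperature=0.01):
    """Score one region embedding against category and background embeddings.

    category_embs: dict of name -> embedding (hypothetical stand-ins for text embeddings).
    background_embs: list of embeddings (hypothetical stand-ins for learned background prompts).
    """
    names = list(category_embs) + ["background"]
    logits = [cosine(region_emb, e) / temperature for e in category_embs.values()]
    # Assumption: the background logit is the best match over all background embeddings.
    logits.append(max(cosine(region_emb, b) for b in background_embs) / temperature)
    return dict(zip(names, softmax(logits)))

# Toy embeddings, chosen by hand for illustration only.
category_embs = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
background_embs = [[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]]

object_scores = classify_region([0.9, 0.1, 0.0], category_embs, background_embs)
clutter_scores = classify_region([0.0, 0.1, 1.0], category_embs, background_embs)
print(max(object_scores, key=object_scores.get))
print(max(clutter_scores, key=clutter_scores.get))
```

Using several background embeddings instead of a single one lets different kinds of non-object regions match different vectors, which is one simple way to keep a single catch-all background class from absorbing unlabeled novel objects.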

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Balanced Selection · Knowledge Distillation