Scaling Novel Object Detection with Weakly Supervised Detection Transformers

Tyler LaBonte; Yale Song; Xin Wang; Vibhav Vineet; Neel Joshi

arXiv:2207.05205 · cs.CV · May 29, 2023

TL;DR

This paper introduces the Weakly Supervised Detection Transformer, which transfers knowledge from a large-scale pretraining dataset to weakly supervised object detection (WSOD) finetuning on hundreds of novel objects. It outperforms previous WSOD models on large-scale datasets while needing fewer training rounds and refinement steps.

Contribution

The paper presents a transformer-based WSOD framework that transfers knowledge from a large-scale pretraining dataset to finetuning on hundreds of novel objects using only image-level labels.

Findings

01. Outperforms previous state-of-the-art WSOD models on large-scale datasets.

02. For WSOD pretraining, the number of classes matters more than the number of images.

03. The proposed method reduces the number of training rounds and refinement steps.

Abstract

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations, which are time-consuming and expensive to collect. Weakly supervised object detection (WSOD) offers an appealing alternative, where object detectors can be trained using image-level labels. However, the practical application of current WSOD models is limited, as they only operate at small data scales and require multiple rounds of training and refinement. To address this, we propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning on hundreds of novel objects. Additionally, we leverage pretrained knowledge to improve the multiple instance learning (MIL) framework often used in WSOD methods. Our experiments show that our approach outperforms…
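The abstract refers to the multiple instance learning (MIL) framework often used in WSOD. As a rough illustration only, the sketch below shows the common two-stream MIL aggregation popularized by WSDDN, in which per-proposal scores are pooled into image-level scores that can be trained against image-level labels. It is not the paper's implementation: the function names (`mil_image_scores`, `image_level_bce`) and the plain-list inputs are assumptions made for this example, and the paper's own MIL improvements are not reproduced here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mil_image_scores(cls_logits, det_logits):
    """Two-stream MIL aggregation (WSDDN-style sketch, hypothetical helper).

    Both inputs are nested lists shaped [num_proposals][num_classes].
    Returns (proposal_scores, image_scores).
    """
    num_proposals = len(cls_logits)
    num_classes = len(cls_logits[0])

    # Classification stream: for each proposal, softmax over classes.
    cls_probs = [softmax(row) for row in cls_logits]

    # Detection stream: for each class, softmax over proposals.
    det_probs_by_class = [
        softmax([det_logits[p][c] for p in range(num_proposals)])
        for c in range(num_classes)
    ]

    # Per-proposal score is the product of the two streams.
    proposal_scores = [
        [cls_probs[p][c] * det_probs_by_class[c][p] for c in range(num_classes)]
        for p in range(num_proposals)
    ]

    # Image-level score per class: sum over proposals, which lies in [0, 1].
    image_scores = [
        sum(proposal_scores[p][c] for p in range(num_proposals))
        for c in range(num_classes)
    ]
    return proposal_scores, image_scores


def image_level_bce(image_scores, labels, eps=1e-7):
    """Mean binary cross-entropy between image scores and 0/1 image labels."""
    total = 0.0
    for score, label in zip(image_scores, labels):
        score = min(max(score, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return total / len(labels)


# Usage: two proposals, two classes, image labeled as containing both classes.
cls_logits = [[4.0, 0.0], [0.0, 4.0]]
det_logits = [[4.0, 0.0], [0.0, 4.0]]
proposal_scores, image_scores = mil_image_scores(cls_logits, det_logits)
loss = image_level_bce(image_scores, [1, 1])
```

Only the image-level loss needs labels, which is what lets this kind of head train without bounding boxes; at inference time the per-proposal scores are what rank candidate boxes.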

Code & Models

Repositories

tmlabonte/weakly-supervised-DETR (PyTorch, official)

Videos

Scaling Novel Object Detection with Weakly Supervised Detection Transformers · YouTube

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Absolute Position Encodings · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Adam