Learning to Detect Every Thing in an Open World

Kuniaki Saito; Ping Hu; Trevor Darrell; Kate Saenko

arXiv:2112.01698 · cs.CV · April 14, 2022 · 1 citation

Open Access

TL;DR

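This paper introduces LDET (Learning to Detect Every Thing), a data augmentation and training scheme for open-world object detection and instance segmentation. It pastes annotated objects onto a background sampled from a small region of the original image and decouples training into two parts, which improves generalization to unseen objects.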

Contribution

The paper proposes a novel data augmentation and training approach called LDET that enhances detection of unlabeled objects in open-world scenarios.

Findings

01. Significant improvements on open-world instance segmentation datasets.

02. Outperforms baselines on cross-category generalization on COCO.

03. Achieves better cross-dataset performance on UVO and Cityscapes.

Abstract

Many open-world applications require the detection of novel objects, yet state-of-the-art object detection and instance segmentation networks do not excel at this task. The key issue lies in their assumption that regions without any annotations should be suppressed as negatives, which teaches the model to treat the unannotated objects as background. To address this issue, we propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET). To avoid suppressing hidden objects, background objects that are visible but unlabeled, we paste annotated objects on a background image sampled from a small region of the original image. Since training solely on such synthetically-augmented images suffers from domain shift, we decouple the training into two parts: 1) training the region classification and regression head on augmented…
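The augmentation described in the abstract (pasting annotated objects onto a background sampled from a small region of the original image) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `paste_on_sampled_background`, the `region_frac` parameter, the uniform random choice of the region, and the nearest-neighbour upscaling are all assumptions made for illustration.

```python
import numpy as np

def paste_on_sampled_background(image, masks, region_frac=0.125, rng=None):
    """Sketch of the paste-on-sampled-background idea (hypothetical helper).

    image: (H, W, C) array; masks: list of (H, W) boolean instance masks.
    region_frac is an assumed hyperparameter, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Pick a small patch of the original image to serve as background.
    ph, pw = max(1, int(h * region_frac)), max(1, int(w * region_frac))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    patch = image[top:top + ph, left:left + pw]

    # Upscale the patch to the full image size (nearest neighbour, assumed).
    rows = np.arange(h) * ph // h
    cols = np.arange(w) * pw // w
    background = patch[rows[:, None], cols[None, :]]

    # Union of all annotated instance masks.
    foreground = np.zeros((h, w), dtype=bool)
    for mask in masks:
        foreground |= mask

    # Paste the annotated pixels back on top of the synthetic background.
    out = background.copy()
    out[foreground] = image[foreground]
    return out
```

In the output, every pixel outside the annotated masks comes from the small sampled patch, so the rest of the original scene, including any visible but unlabeled objects in it, is no longer used as a negative region.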

Taxonomy

Topics: Advanced Neural Network Applications · Remote-Sensing Image Classification · Video Surveillance and Tracking Methods