Zero-Shot Object Detection

Ankan Bansal, Karan Sikka, Gaurav Sharma, Rama Chellappa, and Ajay Divakaran

arXiv:1804.04340·cs.CV·July 30, 2018

TL;DR

This paper introduces zero-shot object detection (ZSD), the task of detecting object classes not seen during training, and addresses it with visual-semantic embeddings and background-aware learning, evaluated on MSCOCO and VisualGenome.

Contribution

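The paper adapts visual-semantic embeddings to detection, proposes two background-aware models (one with a fixed background class, one based on iterative latent assignments), and densely samples the semantic label space using auxiliary data with many categories to compensate for the limited number of training classes.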

Findings

01

Effective detection of unseen classes demonstrated on MSCOCO and VisualGenome.

02

Background-aware approaches improve zero-shot detection robustness.

03

Dense semantic sampling enhances generalization to novel categories.

Abstract

We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification. We present a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel…
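As a rough illustration of the visual-semantic embedding idea mentioned in the abstract, the sketch below projects a region feature into a semantic space and scores it against class embeddings by cosine similarity. Everything here is an assumption made for illustration: the function names, the linear projection, the toy dimensions, and the class names and vectors are hypothetical and are not taken from the paper, whose actual model, loss, and background handling are not described in this summary.

```python
import numpy as np

def cosine_scores(region_feature, projection, class_embeddings):
    """Project a visual feature into the semantic space and return its
    cosine similarity to each class embedding (one row per class)."""
    projected = projection @ region_feature
    projected = projected / np.linalg.norm(projected)
    normed = class_embeddings / np.linalg.norm(
        class_embeddings, axis=1, keepdims=True
    )
    return normed @ projected

def predict_unseen(region_feature, projection, unseen_names, unseen_embeddings):
    """Return the best-matching unseen class name and its similarity score."""
    scores = cosine_scores(region_feature, projection, unseen_embeddings)
    best = int(np.argmax(scores))
    return unseen_names[best], float(scores[best])

# Toy setup (hypothetical values): a 4-D visual feature mapped to a 3-D
# semantic space. In practice the projection would be learned on seen classes.
projection = np.eye(3, 4)
unseen_names = ["cat", "bus"]
unseen_embeddings = np.array([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]])
region_feature = np.array([0.2, 0.9, 0.1, 0.5])

label, score = predict_unseen(
    region_feature, projection, unseen_names, unseen_embeddings
)
print(label, round(score, 3))
```

The key point of the sketch is that unseen classes need only an embedding in the semantic space, not training images; the projection is the only component fit to visual data.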
