Weakly Supervised Deep Detection Networks

Hakan Bilen; Andrea Vedaldi

arXiv:1511.02853·cs.CV·December 20, 2016

TL;DR

This paper introduces a deep architecture for weakly supervised object detection. It adapts a convolutional neural network pre-trained for image-level classification so that it selects regions and classifies them at the same time, and it outperforms alternative weakly supervised detection systems on PASCAL VOC.

Contribution

The paper proposes an end-to-end deep detection network that is trained as an image classifier, using image-level labels only, and implicitly learns object detectors, improving weakly supervised detection performance on PASCAL VOC.

Findings

01 Outperforms alternative weakly supervised detection systems on PASCAL VOC

02 Implicitly learns object detectors while being trained only as an image classifier

03 Outperforms standard data augmentation and fine-tuning techniques on image-level classification

Abstract

Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution. In this paper, we address this problem by exploiting the power of deep convolutional neural networks pre-trained on large-scale image-level classification tasks. We propose a weakly supervised deep detection architecture that modifies one such network to operate at the level of image regions, simultaneously performing region selection and classification. Trained as an image classifier, the architecture implicitly learns object detectors that are better than alternative weakly supervised detection systems on the PASCAL VOC data. The model, which is a simple and elegant end-to-end architecture, also outperforms standard data augmentation and fine-tuning techniques for the task of image-level classification.
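The abstract does not spell out how region selection and classification are combined, so the following is only a minimal sketch of one way to read it, loosely following the two-stream design described in the full paper: one stream normalises scores over classes for each region, the other normalises over regions for each class, and their product is summed over regions to give image-level scores that can be trained with image labels alone. The function names (`combine_streams`, `image_scores`) and the toy logits are hypothetical and chosen for illustration; region feature extraction from the pre-trained network is omitted.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = regions, columns = classes


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def combine_streams(cls_logits: Matrix, det_logits: Matrix) -> Matrix:
    """Per-region class scores from two streams (illustrative sketch).

    cls_logits and det_logits are hypothetical per-region outputs with the
    same shape: one row per region, one column per class.
    """
    num_regions = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: each region spreads its mass over the classes.
    cls_probs = [softmax(row) for row in cls_logits]
    # Selection stream: each class spreads its mass over the regions.
    det_probs = [
        softmax([det_logits[r][c] for r in range(num_regions)])
        for c in range(num_classes)
    ]
    # Element-wise product: a region scores high for a class only if it
    # both looks like that class and is selected for that class.
    return [
        [cls_probs[r][c] * det_probs[c][r] for c in range(num_classes)]
        for r in range(num_regions)
    ]


def image_scores(region_scores: Matrix) -> List[float]:
    """Sum region scores per class; each result lies strictly in (0, 1)."""
    return [sum(column) for column in zip(*region_scores)]


if __name__ == "__main__":
    # Toy input: 3 regions, 2 classes (values are made up).
    cls = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
    det = [[3.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
    scores = combine_streams(cls, det)
    print("region scores:", scores)
    print("image scores:", image_scores(scores))
```

Because the image-level scores are sums of the per-region scores, a loss on image labels sends gradient back to individual regions, which is how a model trained as a classifier can end up ranking regions as detections. At test time, the per-region scores would be used directly to rank candidate boxes.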
