Active Object Localization with Deep Reinforcement Learning

Juan C. Caicedo; Svetlana Lazebnik

arXiv:1511.06015 · cs.CV · November 20, 2015

TL;DR

This paper introduces a deep reinforcement learning-based active detection model for class-specific object localization that efficiently identifies target objects by deforming bounding boxes through learned actions, reducing the number of regions analyzed.

Contribution

The paper presents a deep reinforcement learning approach to active object localization that obtains the best detection results on Pascal VOC 2007 among systems that do not use object proposals.

Findings

01. Localized a single object instance after analyzing only 11 to 25 regions per image

02. Obtained the best detection results among systems that do not use object proposals

03. Demonstrated effective top-down reasoning for object localization

Abstract

We present an active detection model for localizing objects in scenes. The model is class-specific and allows an agent to focus attention on candidate regions for identifying the correct location of a target object. This agent learns to deform a bounding box using simple transformation actions, with the goal of determining the most specific location of target objects following top-down reasoning. The proposed localization agent is trained using deep reinforcement learning, and evaluated on the Pascal VOC 2007 dataset. We show that agents guided by the proposed model are able to localize a single instance of an object after analyzing only between 11 and 25 regions in an image, and obtain the best detection results among systems that do not use object proposals for object localization.
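The abstract's idea of an agent that deforms a bounding box with simple transformation actions can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not stated in the abstract: the action names, the step fraction `ALPHA`, and the helper `greedy_oracle`. The oracle looks at the ground-truth box and greedily picks the action that most improves intersection-over-union (IoU); it stands in for the learned policy, which would have to choose actions from image features alone.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

ALPHA = 0.2  # assumed step size, as a fraction of the current box width/height
ACTIONS = ("left", "right", "up", "down", "bigger", "smaller", "fatter", "taller")


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def apply_action(box: Box, action: str, alpha: float = ALPHA) -> Box:
    """Return the box after one hypothetical transformation action."""
    x1, y1, x2, y2 = box
    dw = alpha * (x2 - x1)
    dh = alpha * (y2 - y1)
    if action == "right":
        x1, x2 = x1 + dw, x2 + dw
    elif action == "left":
        x1, x2 = x1 - dw, x2 - dw
    elif action == "down":
        y1, y2 = y1 + dh, y2 + dh
    elif action == "up":
        y1, y2 = y1 - dh, y2 - dh
    elif action == "bigger":
        x1, x2, y1, y2 = x1 - dw / 2, x2 + dw / 2, y1 - dh / 2, y2 + dh / 2
    elif action == "smaller":
        x1, x2, y1, y2 = x1 + dw / 2, x2 - dw / 2, y1 + dh / 2, y2 - dh / 2
    elif action == "fatter":  # shrink height, so the box becomes relatively wider
        y1, y2 = y1 + dh / 2, y2 - dh / 2
    elif action == "taller":  # shrink width, so the box becomes relatively taller
        x1, x2 = x1 + dw / 2, x2 - dw / 2
    else:
        raise ValueError(f"unknown action: {action}")
    return (x1, y1, x2, y2)


def greedy_oracle(start: Box, target: Box, max_steps: int = 50):
    """Oracle stand-in for the learned policy: take the action with the
    largest IoU gain, and stop when no action improves the IoU."""
    box, steps = start, 0
    while steps < max_steps:
        best = max((apply_action(box, a) for a in ACTIONS),
                   key=lambda b: iou(b, target))
        if iou(best, target) <= iou(box, target):
            break  # plays the role of a "stop" decision
        box, steps = best, steps + 1
    return box, steps


if __name__ == "__main__":
    final_box, n_steps = greedy_oracle((0, 0, 100, 100), (30, 30, 70, 70))
    print(final_box, n_steps, iou(final_box, (30, 30, 70, 70)))
```

Each step of such a loop corresponds to one region being analyzed, which is the quantity the abstract reports as between 11 and 25 per localized instance. The step count produced by this oracle is not comparable to that figure, since the oracle has access to the ground truth.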

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics