Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

Yuke Zhu; Roozbeh Mottaghi; Eric Kolve; Joseph J. Lim; Abhinav Gupta; Li Fei-Fei; Ali Farhadi

arXiv:1609.05143·cs.CV·September 19, 2016·163 cites

TL;DR

This paper introduces a goal-conditioned deep reinforcement learning approach for indoor visual navigation that generalizes well across targets and scenes, utilizing an efficient simulation environment for training.

Contribution

The paper proposes a goal-aware actor-critic model and the AI2-THOR simulation framework to improve generalization and data efficiency in target-driven visual navigation.
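
A goal-aware actor-critic model can be pictured as a network that receives both the current state and the target, and returns an action distribution (the actor) together with a state-value estimate (the critic). The following is a minimal sketch under stated assumptions, not the paper's architecture: the class name `GoalConditionedActorCritic`, the single shared `tanh` layer, the layer sizes, the four-action space, and the toy state and goal vectors are all hypothetical and chosen only for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Return weights @ x + bias for plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in, scale=0.5):
    """Random weights and zero bias for a layer (illustrative init only)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class GoalConditionedActorCritic:
    """Hypothetical toy model: policy and value depend on (state, goal)."""

    def __init__(self, state_dim, goal_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        # Shared layer sees the state and the goal concatenated together.
        self.hidden = init_layer(rng, hidden_dim, state_dim + goal_dim)
        self.actor = init_layer(rng, num_actions, hidden_dim)
        self.critic = init_layer(rng, 1, hidden_dim)

    def forward(self, state, goal):
        joint = list(state) + list(goal)
        h = [math.tanh(v) for v in linear(*self.hidden, joint)]
        probs = softmax(linear(*self.actor, h))   # actor: pi(a | state, goal)
        value = linear(*self.critic, h)[0]        # critic: V(state, goal)
        return probs, value


# Same state, two different goals: the policy output changes with the goal.
model = GoalConditionedActorCritic(state_dim=4, goal_dim=4,
                                   hidden_dim=8, num_actions=4)
state = [0.2, -0.1, 0.5, 0.0]
probs_a, value_a = model.forward(state, goal=[1.0, 0.0, 0.0, 0.0])
probs_b, value_b = model.forward(state, goal=[0.0, 0.0, 0.0, 1.0])
```

Because the goal is an input rather than something baked into the weights, one set of parameters can in principle serve many targets, which is the generalization property the summary refers to. The sketch omits training entirely; it only shows the shape of the forward pass.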

Findings

1. Faster convergence than existing methods
2. Effective generalization across targets and scenes
3. Successful transfer to real robot scenarios with minimal fine-tuning

Abstract

Two less-addressed issues of deep reinforcement learning are (1) a lack of generalization capability to new target goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge, which makes it impractical to apply to real-world scenarios. In this paper, we address these two issues and apply our model to the task of target-driven visual navigation. To address the first issue, we propose an actor-critic model whose policy is a function of the goal as well as the current state, which allows it to generalize better. To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine. Our framework enables agents to take actions and interact with objects. Hence, we can collect a huge number of training samples efficiently. We show that our proposed method (1)…

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning