DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

Erik Wijmans; Abhishek Kadian; Ari Morcos; Stefan Lee; Irfan Essa; Devi Parikh; Manolis Savva; Dhruv Batra

arXiv:1911.00357·cs.CV·January 22, 2020·36 cites

Open Access · 5 Repos

TL;DR

This paper introduces DD-PPO, a scalable distributed reinforcement learning method, and uses it to train a near-perfect PointGoal navigation agent on 2.5 billion frames, setting the state of the art on the Habitat Autonomous Navigation Challenge 2019.

Contribution

The paper presents DD-PPO, a simple, scalable, and decentralized distributed RL algorithm enabling massive-scale training for embodied AI navigation tasks.

Findings

01

Achieved a 107x speedup on 128 GPUs over a serial implementation.

02

Trained an agent for 2.5 billion steps, the equivalent of 80 years of human experience.

03

Set a new state of the art on the Habitat Autonomous Navigation Challenge 2019.
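
The first two findings can be sanity-checked with simple arithmetic. The sketch below is not from the paper: it assumes that scaling efficiency is defined as speedup divided by GPU count, and that a year is 365.25 days. The function names are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope checks on the reported numbers.
# Assumptions (not from the paper): efficiency = speedup / GPU count,
# and one year = 365.25 days.

SPEEDUP = 107.0        # reported speedup over a serial implementation
NUM_GPUS = 128         # number of GPUs used for that measurement
TOTAL_STEPS = 2.5e9    # reported steps of experience
HUMAN_YEARS = 80       # reported human-experience equivalent
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def scaling_efficiency(speedup: float, num_workers: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / num_workers


def steps_per_second_of_experience(total_steps: float, years: float) -> float:
    """Simulation steps per second of the equivalent human experience."""
    return total_steps / (years * SECONDS_PER_YEAR)


efficiency = scaling_efficiency(SPEEDUP, NUM_GPUS)
rate = steps_per_second_of_experience(TOTAL_STEPS, HUMAN_YEARS)

print(f"scaling efficiency: {efficiency:.1%}")            # 83.6%
print(f"steps per second of experience: {rate:.2f}")      # 0.99
```

Under these assumptions, 107x on 128 GPUs corresponds to roughly 84% of ideal linear scaling, and the 80-year comparison works out to about one simulation step per second of human experience.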

Abstract

We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling -- achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) -- over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs. This massive-scale training not only sets the state of art on Habitat Autonomous Navigation Challenge 2019, but essentially solves the…
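
To make the three properties named in the abstract concrete, here is a minimal single-process sketch of synchronous, decentralized gradient averaging. It is not the authors' implementation: the PPO objective is replaced by a toy quadratic loss, the workers are simulated in one Python process, and `all_reduce_mean`, `local_gradient`, and the constants are hypothetical names introduced for illustration. A real multi-machine system would typically perform the averaging with a peer-to-peer all-reduce collective instead of the in-process loop shown here.

```python
import random

NUM_WORKERS = 4   # hypothetical number of simulated workers
DIM = 3           # size of the toy parameter vector
LR = 0.1          # learning rate for plain SGD
TARGET = 3.0      # mean of the toy data; the optimum of the toy loss


def local_gradient(params, batch):
    """Gradient of the toy loss 0.5 * (p - mean(batch))^2 for each parameter.

    Stands in for the policy gradient a worker would compute from its own
    rollouts; it is NOT the PPO objective.
    """
    batch_mean = sum(batch) / len(batch)
    return [p - batch_mean for p in params]


def all_reduce_mean(per_worker_grads):
    """Simulated all-reduce: every worker receives the same averaged gradient.

    There is no parameter server; conceptually each worker contributes its
    gradient and gets the mean back.
    """
    n = len(per_worker_grads)
    dim = len(per_worker_grads[0])
    mean = [sum(g[i] for g in per_worker_grads) / n for i in range(dim)]
    return [list(mean) for _ in range(n)]


def train(steps=200, seed=0):
    # Each worker has its own data stream but starts from identical parameters.
    rngs = [random.Random(seed + rank) for rank in range(NUM_WORKERS)]
    replicas = [[0.0] * DIM for _ in range(NUM_WORKERS)]
    for _ in range(steps):
        # 1. Every worker computes a gradient on its own locally collected data.
        grads = [
            local_gradient(replicas[r], [rngs[r].gauss(TARGET, 0.5) for _ in range(8)])
            for r in range(NUM_WORKERS)
        ]
        # 2. Synchronous step: all workers wait for the averaged gradient.
        averaged = all_reduce_mean(grads)
        # 3. Every worker applies the same update, so replicas never diverge.
        for r in range(NUM_WORKERS):
            replicas[r] = [p - LR * g for p, g in zip(replicas[r], averaged[r])]
    return replicas


replicas = train()
print(replicas[0])
```

Because every replica starts from the same parameters and applies the same averaged gradient at every step, all workers hold identical parameters throughout, which is what "no computation is ever stale" amounts to in this toy setting.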

Taxonomy

Topics: Advanced Neural Network Applications · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications

Methods: Decentralized Distributed Proximal Policy Optimization