Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player

Hanlin Niu, Ze Ji, Farshad Arvin, Barry Lennox, Hujun Yin, and Joaquin Carrasco

arXiv:2102.10711 · cs.AI · February 24, 2021 · 1 citation

TL;DR

This paper introduces a sensor-level, mapless collision avoidance algorithm for mobile robots that learns efficiently from human and self-exploratory data, achieving faster training and robust real-world performance without fine-tuning.

Contribution

It proposes a novel training strategy combining human experience and self-exploration with prioritized replay, enabling rapid learning and transfer from simulation to real robots.

Findings

01. Achieved similar rewards with significantly fewer training steps than standard DDPG.

02. Reached collision-free navigation after less than 2 to 2.5 hours of training in the simulated environments.

03. Transferred the trained model to real robots without additional fine-tuning.

Abstract

This paper presents a sensor-level mapless collision avoidance algorithm for mobile robots that maps raw sensor data to linear and angular velocities and navigates an unknown environment without a map. An efficient training strategy is proposed that allows a robot to learn from both human experience data and self-exploratory data. A game-format simulation framework is designed so that a human player can tele-operate the mobile robot to a goal, and the human actions are also scored using the reward function. Both human player data and self-playing data are sampled using the prioritized experience replay algorithm. The proposed algorithm and training strategy were evaluated in two experimental configurations: Environment 1, a simulated cluttered environment, and Environment 2, a simulated corridor environment. It was…
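The abstract states that human player data and self-playing data are both sampled with prioritized experience replay, but this excerpt gives no implementation details. The following is a minimal sketch of how such a shared buffer might look, assuming the standard proportional variant in which priorities come from the absolute TD error. The names `Transition` and `PrioritizedReplayBuffer`, the `source` tag, and the values of `alpha`, `beta`, and `eps` are illustrative assumptions, not taken from the paper or its repository.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    state: tuple        # raw sensor reading (hypothetical encoding)
    action: tuple       # (linear velocity, angular velocity)
    reward: float       # same reward function scores human and agent actions
    next_state: tuple
    done: bool
    source: str         # "human" or "agent"; kept only for bookkeeping


class PrioritizedReplayBuffer:
    """Proportional prioritized replay shared by human and agent data (sketch)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3, seed=None):
        self.capacity = capacity
        self.alpha = alpha          # 0 = uniform sampling, 1 = fully prioritized
        self.eps = eps              # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0
        self.rng = random.Random(seed)

    def add(self, transition):
        # New transitions take the current maximum priority so that they
        # are likely to be replayed soon after being stored.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer full: overwrite the oldest slot.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        total = sum(weights)
        n = len(self.items)
        indices = self.rng.choices(range(n), weights=weights, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        is_weights = [(n * weights[i] / total) ** (-beta) for i in indices]
        max_w = max(is_weights)
        is_weights = [w / max_w for w in is_weights]
        return indices, [self.items[i] for i in indices], is_weights

    def update_priorities(self, indices, td_errors):
        # After a learning step, priorities follow the new absolute TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


# Usage: human tele-operation data and agent rollouts go into the same buffer.
buffer = PrioritizedReplayBuffer(capacity=1000, seed=0)
buffer.add(Transition((0.5, 0.5), (0.2, 0.0), 1.0, (0.4, 0.5), False, "human"))
buffer.add(Transition((0.5, 0.5), (0.1, 0.3), -0.5, (0.5, 0.4), False, "agent"))
indices, batch, is_weights = buffer.sample(batch_size=4)
buffer.update_priorities(indices, td_errors=[0.9, 0.1, 0.5, 0.3])
```

In this sketch the two data sources are not treated differently at sampling time: a human transition is replayed more often only if its TD error is large. Whether the authors weight human data separately cannot be determined from this excerpt.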

Code & Models

Repositories

hanlinniu/turtlebot3_ddpg_collision_avoidance
tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Human Pose and Action Recognition

Methods: Adam · Convolution · Experience Replay · Weight Decay · Prioritized Experience Replay · Batch Normalization · Dense Connections · Deep Deterministic Policy Gradient