Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation

Jack Saunders; Sajad Saeedi; Wenbin Li

arXiv:2209.11094 · cs.RO · October 28, 2024

Open Access

TL;DR

This paper introduces a parallel simulation framework, built on AirSim, that accelerates reinforcement learning training for visual quadrotor navigation by running many agents across multiple networked computers. With 74 agents on two machines, training time drops from 3.9 hours to 11 minutes.

Contribution

It presents a novel parallel training framework for RL in quadrotor navigation, reducing training time by leveraging decentralised AirSim environments and multiple agents.

Findings

01

Training time reduced from 3.9 hours to 11 minutes.

02

Utilized 74 agents across two networked computers.

03

Achieved efficient parallel RL training for complex visual navigation tasks.
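The reported training times imply a wall-clock speedup that can be checked with simple arithmetic. The snippet below is only a sanity check on the two figures quoted above; the function name `speedup` is an illustrative choice, and nothing here says how the baseline run was configured.

```python
def speedup(baseline_minutes: float, parallel_minutes: float) -> float:
    """Ratio of baseline wall-clock time to parallel wall-clock time."""
    return baseline_minutes / parallel_minutes


# Figures quoted in the findings: 3.9 hours versus 11 minutes.
baseline_minutes = 3.9 * 60   # 234 minutes
parallel_minutes = 11.0

factor = speedup(baseline_minutes, parallel_minutes)
print(f"Wall-clock speedup: about {factor:.1f}x")
```

By these figures, the parallel run finishes roughly 21 times faster than the baseline.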

Abstract

Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate within the physical world. Gathering data for RL is known to be a laborious task, and real-world experiments can be risky. Simulators facilitate the collection of training data in a quicker and more cost-effective manner. However, RL frequently requires a significant number of simulation steps for an agent to become skilful at simple tasks. This is a prevalent issue within the field of RL-based visual quadrotor navigation where state dimensions are typically very large and dynamic models are complex. Furthermore, rendering images and obtaining physical properties of the agent can be computationally expensive. To solve this, we present a simulation framework, built on AirSim, which provides efficient parallel training. Building on this framework, Ape-X is modified to incorporate decentralised training…
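As background for the Ape-X reference in the abstract, here is a minimal, single-process sketch of the general Ape-X idea: several actors with different exploration rates feed one prioritized replay buffer, from which a learner samples. It is not the authors' framework and does not use AirSim. The toy actor, the random stand-in for TD error, and the names `PrioritizedReplay`, `actor_epsilon`, and `toy_actor_step` are all assumptions made for illustration. The per-actor epsilon schedule follows the form given in the original Ape-X paper (base 0.4, exponent scale 7).

```python
import random


def actor_epsilon(i: int, n_actors: int, base: float = 0.4, alpha: float = 7.0) -> float:
    """Per-actor exploration rate: base ** (1 + alpha * i / (n_actors - 1))."""
    if n_actors == 1:
        return base
    return base ** (1 + alpha * i / (n_actors - 1))


class PrioritizedReplay:
    """Proportional prioritized replay, list-based for clarity (O(n) sampling).

    A practical implementation would typically use a sum tree and stable keys.
    """

    def __init__(self, capacity: int, alpha: float = 0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.items = []
        self.priorities = []

    def add(self, transition, priority: float) -> None:
        if len(self.items) >= self.capacity:
            # Drop the oldest transition when full.
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size: int, rng: random.Random):
        # Sample indices with probability proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        indices = rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update(self, indices, new_priorities) -> None:
        # The learner writes back refreshed priorities after a training step.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p


def toy_actor_step(actor_id: int, epsilon: float, rng: random.Random):
    """Hypothetical stand-in for one simulator step.

    Returns a fake transition and a fake initial priority (random |TD error|).
    """
    action = rng.randrange(4) if rng.random() < epsilon else 0
    priority = rng.random() + 1e-3   # small offset keeps priorities positive
    return {"actor": actor_id, "action": action}, priority


rng = random.Random(0)
n_actors = 4
replay = PrioritizedReplay(capacity=1000)

# Actors are run in a plain loop here; in a distributed setup they would
# run concurrently on separate processes or machines.
for _ in range(50):
    for i in range(n_actors):
        transition, priority = toy_actor_step(i, actor_epsilon(i, n_actors), rng)
        replay.add(transition, priority)

# Learner side: sample a prioritized batch, then refresh its priorities.
indices, batch = replay.sample(8, rng)
replay.update(indices, [0.5] * len(indices))
```

The loop over actors stands in for parallelism only conceptually. The point of the sketch is the data flow: diverse actors write experience with initial priorities, and the learner reads and reprioritizes it.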

Taxonomy

Topics: Evacuation and Crowd Dynamics · Robotic Path Planning Algorithms · UAV Applications and Optimization

Methods: Prioritized Experience Replay · Ape-X