Equivariant Reinforcement Learning for Quadrotor UAV

Beomyeol Yu; Taeyoung Lee

arXiv:2206.01233 · cs.LG · February 28, 2023

Open Access

TL;DR

This paper introduces an equivariant reinforcement learning framework for quadrotor UAVs that leverages symmetry properties to reduce training complexity and improve sample efficiency, demonstrated with TD3 and SAC algorithms.

Contribution

The paper proposes an equivariant RL approach that exploits a symmetry of the quadrotor dynamics, reducing the dimension of the state used in training by one and improving training efficiency.

Findings

01. Significant reduction in training samples needed for effective learning.

02. Improved sample efficiency demonstrated with TD3 and SAC algorithms.

03. The equivariant framework enhances RL applicability in resource-limited scenarios.

Abstract

This paper presents an equivariant reinforcement learning framework for quadrotor unmanned aerial vehicles. Successfully training a reinforcement learning agent often requires numerous interactions with the environment, which hinders its applicability, especially when the available computational resources are limited or when no reliable simulation model exists. We identify an equivariance property of the quadrotor dynamics that reduces the dimension of the state required in training by one, thereby substantially improving the sample efficiency of reinforcement learning. This is illustrated by numerical examples with two popular reinforcement learning techniques, TD3 and SAC.
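The abstract does not spell out which symmetry is used, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the standard rigid-body quadrotor model (position x, velocity v, attitude R, body angular velocity W, thrust f, moment M, third inertial axis along gravity) and assumes the symmetry is a rotation about the gravity axis. The function names (`dynamics`, `act`, `canonicalize`) and all numerical values are hypothetical. The sketch checks numerically that rotating the state rotates the state derivative in the same way, and shows one way such a symmetry can remove one state dimension: rotate every state so the horizontal position lies on a fixed half-axis.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
MASS = 1.0
GRAVITY = 9.81
INERTIA = np.diag([0.02, 0.02, 0.04])
E3 = np.array([0.0, 0.0, 1.0])  # assumed gravity direction


def rot_z(theta):
    """Rotation by theta about the third (gravity) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def hat(w):
    """Skew-symmetric matrix such that hat(w) @ u == np.cross(w, u)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def dynamics(x, v, R, W, f, M):
    """State derivative of the assumed rigid-body quadrotor model."""
    x_dot = v
    v_dot = GRAVITY * E3 - (f / MASS) * (R @ E3)
    R_dot = R @ hat(W)
    W_dot = np.linalg.solve(INERTIA, M - np.cross(W, INERTIA @ W))
    return x_dot, v_dot, R_dot, W_dot


def act(G, x, v, R, W):
    """Assumed group action: rotate x, v, R; body-frame W is unchanged."""
    return G @ x, G @ v, G @ R, W


def canonicalize(x, v, R, W):
    """Rotate about gravity so that x[1] becomes 0 and x[0] >= 0."""
    theta = np.arctan2(x[1], x[0])
    return act(rot_z(-theta), x, v, R, W)


# Build an arbitrary state with a valid rotation matrix.
rng = np.random.default_rng(0)
x, v, W = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R) < 0:
    R[:, 0] *= -1.0  # flip one column so that det(R) = +1
f, M = 12.0, rng.normal(size=3)
G = rot_z(0.7)

# Equivariance check: derivative at the rotated state vs rotated derivative.
d = dynamics(x, v, R, W, f, M)
d_rot = dynamics(*act(G, x, v, R, W), f, M)
expected = (G @ d[0], G @ d[1], G @ d[2], d[3])
max_err = max(np.abs(a - b).max() for a, b in zip(d_rot, expected))
print("max equivariance error:", max_err)

# Canonical form: the second position coordinate is removed (always zero).
x_c, v_c, R_c, W_c = canonicalize(x, v, R, W)
print("canonical position:", x_c)
```

Under these assumptions, the control input (f, M) is unchanged by the rotation, so a policy could be evaluated on the canonical state and its output applied directly to the original state. The canonical state has one fewer free coordinate, which is consistent with the one-dimension reduction the abstract describes.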

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Adaptive Control of Nonlinear Systems

Methods: Target Policy Smoothing · Clipped Double Q-learning · Convolution · Dense Connections · 1x1 Convolution · Average Pooling · Global Average Pooling · Dilated Convolution · Adam