A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering

Qihan Qi; Xinsong Yang; Gang Xia; Daniel W. C. Ho; Pengyang Tang

arXiv:2410.06847 · cs.AI · October 10, 2024

TL;DR

This paper introduces the SMAC method, combining a safety modulator and distributional critic, to enhance safety and performance in model-free safe reinforcement learning for UAV hovering tasks.

Contribution

It presents a novel safety modulator actor-critic framework with a distributional critic, specifically designed to improve safety constraint satisfaction and reduce overestimation in safe RL.

Findings

01. SMAC effectively maintains safety constraints during UAV hovering.

02. SMAC outperforms baseline algorithms in safety and reward metrics.

03. Experimental results validate the method in both simulation and real-world scenarios.

Abstract

This paper proposes a safety modulator actor-critic (SMAC) method that addresses safety constraints and mitigates overestimation in model-free safe reinforcement learning (RL). A safety modulator is developed to satisfy safety constraints by modulating actions, allowing the policy to ignore the safety constraints and focus on maximizing reward. Additionally, a distributional critic with a theoretical update rule for SMAC is proposed to mitigate the overestimation of Q-values under safety constraints. Experiments on Unmanned Aerial Vehicle (UAV) hovering, in both simulation and real-world scenarios, confirm that SMAC effectively maintains safety constraints and outperforms mainstream baseline algorithms.
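The abstract does not say how the modulator changes an action, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes an additive correction followed by clipping to the actuator range. The names `modulated_action`, `toy_policy`, and `toy_modulator`, the 1-D altitude state, and the 1.5 m soft limit are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, List

Vector = List[float]


def clip(x: float, low: float, high: float) -> float:
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def modulated_action(
    state: Vector,
    policy: Callable[[Vector], Vector],
    modulator: Callable[[Vector, Vector], Vector],
    low: float = -1.0,
    high: float = 1.0,
) -> Vector:
    """Combine a reward-seeking action with a safety correction.

    ASSUMPTION: the correction is additive and the result is clipped
    to the actuator range. The abstract only says that actions are
    "modulated"; it does not state this form.
    """
    raw = policy(state)               # action chosen for reward only
    delta = modulator(state, raw)     # safety correction for that action
    return [clip(a + d, low, high) for a, d in zip(raw, delta)]


def toy_policy(state: Vector) -> Vector:
    """Hypothetical policy: always climb hard, ignoring any limit."""
    return [0.8]


def toy_modulator(state: Vector, action: Vector) -> Vector:
    """Hypothetical modulator: push down once altitude exceeds 1.5 m."""
    altitude = state[0]
    excess = max(0.0, altitude - 1.5)
    return [-2.0 * excess]


if __name__ == "__main__":
    for altitude in (1.0, 1.9, 3.0):
        action = modulated_action([altitude, 0.0], toy_policy, toy_modulator)
        print(altitude, action)
```

In this toy setup the policy never reasons about the altitude limit; the modulator alone reduces thrust as the limit is approached, which mirrors the division of labor the abstract describes.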

Taxonomy

Topics: Fault Detection and Control Systems · Safety Systems Engineering in Autonomy