Regularizing Action Policies for Smooth Control with Reinforcement Learning

Siddharth Mysore; Bassel Mabsout; Renato Mancuso; Kate Saenko

arXiv:2012.06644·cs.RO·May 28, 2021

TL;DR

This paper introduces CAPS, a regularization method for deep RL controllers that significantly improves action smoothness, reduces power consumption, and enhances real-world control performance, demonstrated on a quadrotor drone.

Contribution

We propose CAPS, a novel regularization technique that enhances the smoothness of RL policies, leading to better control and efficiency in real-world systems.

Findings

01

Almost 80% reduction in power consumption on a real quadrotor drone

02

Elimination of high-frequency components in the control signal

03

Consistent training of flight-worthy controllers

Abstract

A critical problem with the practical utility of controllers trained with deep Reinforcement Learning (RL) is the notable lack of smoothness in the actions learned by the RL policies. This trend often presents itself in the form of control signal oscillation and can result in poor control, high power consumption, and undue system wear. We introduce Conditioning for Action Policy Smoothness (CAPS), an effective yet intuitive regularization on action policies, which offers consistent improvement in the smoothness of the learned state-to-action mappings of neural network controllers, reflected in the elimination of high-frequency components in the control signal. Tested on a real system, improvements in controller smoothness on a quadrotor drone resulted in an almost 80% reduction in power consumption while consistently training flight-worthy controllers. Project website:…
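The abstract does not spell out the form of the regularizer, so the following is only a minimal sketch under stated assumptions: it assumes that smoothness is encouraged by penalizing (a) the distance between actions taken on consecutive states and (b) the distance between actions taken on a state and on a nearby, slightly perturbed state. The function names, the L2 distance, the Gaussian perturbation, and the weights `lambda_t` and `lambda_s` are all hypothetical choices for illustration, not taken from the text above.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length action vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smoothness_penalties(policy, states, next_states, sigma=0.05, seed=0):
    """Return (temporal, spatial) smoothness penalties averaged over a batch.

    policy      -- callable mapping a state (list of floats) to an action (list of floats)
    states      -- batch of states s_t
    next_states -- batch of successor states s_{t+1}, aligned with `states`
    sigma       -- std. dev. of the assumed Gaussian perturbation around each state
    """
    rng = random.Random(seed)
    temporal = 0.0
    spatial = 0.0
    for s, s_next in zip(states, next_states):
        a = policy(s)
        # Assumed temporal term: actions on consecutive states should be close.
        temporal += l2(a, policy(s_next))
        # Assumed spatial term: actions on nearby states should be close.
        s_near = [x + rng.gauss(0.0, sigma) for x in s]
        spatial += l2(a, policy(s_near))
    n = len(states)
    return temporal / n, spatial / n


def regularized_loss(policy_loss, temporal, spatial, lambda_t=1.0, lambda_s=1.0):
    """Add the weighted smoothness penalties to a base policy loss (to be minimized)."""
    return policy_loss + lambda_t * temporal + lambda_s * spatial
```

In this sketch a policy whose output does not depend on the state incurs zero penalty, while a policy that reacts sharply to small state changes is penalized in proportion to how far its actions move. In practice the penalties would be computed on the policy network's outputs inside the training loop of whichever RL algorithm is used, with the weights tuned per task.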
