Regularizing Action Policies for Smooth Control with Reinforcement Learning
Siddharth Mysore, Bassel Mabsout, Renato Mancuso, Kate Saenko

TL;DR
This paper introduces Conditioning for Action Policy Smoothness (CAPS), a regularization method for deep RL controllers that yields smoother actions and removes high-frequency components from the control signal. On a real quadrotor drone, the smoother controllers cut power consumption by almost 80%.
Contribution
We propose CAPS, a novel regularization technique that enhances the smoothness of RL policies, leading to better control and efficiency in real-world systems.
Findings
Almost 80% reduction in power consumption on a real quadrotor drone
Elimination of high-frequency components in the control signal
Consistent training of flight-worthy controllers
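One rough way to check for high-frequency content in a logged control signal is to look at how much of its spectral energy lies above a chosen cutoff. The sketch below is only an illustration and is not the smoothness metric used in the paper; the function name `high_frequency_fraction` and the parameters `sample_rate_hz` and `cutoff_hz` are hypothetical.

```python
import numpy as np

def high_frequency_fraction(signal, sample_rate_hz, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz (illustrative metric only)."""
    signal = np.asarray(signal, dtype=float)
    # Remove the mean so the constant offset does not dominate the spectrum.
    spectrum = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    total = spectrum.sum()
    if total == 0.0:
        return 0.0  # a constant signal has no oscillation at all
    return float(spectrum[freqs >= cutoff_hz].sum() / total)

# Synthetic example: a slow 5 Hz command with and without a 200 Hz oscillation.
t = np.arange(1000) / 1000.0
smooth = np.sin(2 * np.pi * 5 * t)
oscillating = smooth + 0.5 * np.sin(2 * np.pi * 200 * t)
```

A smoother controller should score closer to zero on a measure like this; the choice of cutoff depends on the system and is left to the reader.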
Abstract
A critical problem with the practical utility of controllers trained with deep Reinforcement Learning (RL) is the notable lack of smoothness in the actions learned by the RL policies. This trend often presents itself in the form of control signal oscillation and can result in poor control, high power consumption, and undue system wear. We introduce Conditioning for Action Policy Smoothness (CAPS), an effective yet intuitive regularization on action policies, which offers consistent improvement in the smoothness of the learned state-to-action mappings of neural network controllers, reflected in the elimination of high-frequency components in the control signal. Tested on a real system, improvements in controller smoothness on a quadrotor drone resulted in an almost 80% reduction in power consumption while consistently training flight-worthy controllers. Project website:…
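The abstract does not spell out the regularization terms. As we understand the method, it penalizes the policy when actions differ between consecutive states (temporal smoothness) and when actions differ between a state and a nearby perturbed state (spatial smoothness). The following is a minimal NumPy sketch under those assumptions, not the authors' implementation: the names `caps_penalties`, `regularized_loss`, `sigma`, `lambda_t`, and `lambda_s` are hypothetical, and the Euclidean distance and Gaussian perturbation are assumed choices.

```python
import numpy as np

def caps_penalties(policy, states, next_states, sigma=0.05, rng=None):
    """Return (temporal, spatial) smoothness penalties for a batch of states.

    policy: callable mapping an array of shape (batch, state_dim)
            to actions of shape (batch, action_dim).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    actions = policy(states)
    # Temporal term: actions at consecutive states should be close.
    temporal = np.linalg.norm(actions - policy(next_states), axis=-1).mean()
    # Spatial term: actions at nearby (perturbed) states should be close.
    # Gaussian noise with standard deviation sigma is an assumed choice.
    perturbed = states + rng.normal(0.0, sigma, size=states.shape)
    spatial = np.linalg.norm(actions - policy(perturbed), axis=-1).mean()
    return temporal, spatial

def regularized_loss(policy_loss, temporal, spatial, lambda_t=1.0, lambda_s=1.0):
    """Add weighted smoothness penalties to a policy loss that is being minimized."""
    return policy_loss + lambda_t * temporal + lambda_s * spatial
```

In an actual training loop these terms would be computed with the automatic-differentiation framework used by the RL algorithm so that gradients flow through the policy; the weights trade off task performance against smoothness and would need tuning per system.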
