Decomposing Control Lyapunov Functions for Efficient Reinforcement Learning
Antonio Lopez, David Fridovich-Keil

TL;DR
This paper introduces Decomposed Control Lyapunov Functions (DCLFs) to improve reinforcement learning efficiency in high-dimensional systems by enabling reward shaping, demonstrated through quadcopter landing tasks with reduced data requirements.
Contribution
It proposes a novel system decomposition method to compute DCLFs, facilitating reward shaping in RL for high-dimensional systems where traditional CLF methods are intractable.
Findings
DCLF-based reward shaping improves RL performance.
The method reduces the amount of real-world data needed for training.
A quadcopter lands successfully with less than half the data required by existing methods.
Abstract
Recent methods using Reinforcement Learning (RL) have proven successful for training intelligent agents in unknown environments. However, RL has not been applied widely in real-world robotics scenarios. This is because current state-of-the-art RL methods require large amounts of data to learn a specific task, leading to unreasonable costs when deploying the agent to collect data in real-world applications. In this paper, we build on existing work that reshapes the reward function in RL by introducing a Control Lyapunov Function (CLF), which has been shown to reduce sample complexity. This formulation still requires a known CLF for the system, and because no general method exists for finding one, identifying a suitable CLF is often difficult. Existing work can compute low-dimensional CLFs via a Hamilton-Jacobi reachability procedure. However, this class of methods becomes…
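To make the idea of reshaping a reward with a CLF concrete, the sketch below uses standard potential-based reward shaping with the potential set to the negative of a candidate Lyapunov function. This is an illustrative stand-in and not necessarily the formulation used in the paper: the quadratic form V(x) = xᵀPx, the matrix P, the discount factor, and the function names quadratic_clf and shaped_reward are all assumptions introduced here.

```python
import numpy as np


def quadratic_clf(x, P):
    """Hypothetical candidate CLF V(x) = x^T P x, with P assumed symmetric positive definite."""
    return float(x @ P @ x)


def shaped_reward(reward, x, x_next, P, gamma=0.99):
    """Potential-based shaping with potential phi(x) = -V(x).

    Transitions that decrease V (move toward the equilibrium at the origin)
    receive a bonus; transitions that increase V are penalized.
    """
    phi = -quadratic_clf(x, P)
    phi_next = -quadratic_clf(x_next, P)
    return reward + gamma * phi_next - phi


# Toy 2-D state; P = I is an assumption made purely for illustration.
P = np.eye(2)
x = np.array([1.0, 0.0])
x_closer = np.array([0.5, 0.0])   # V drops from 1.0 to 0.25
x_farther = np.array([2.0, 0.0])  # V rises from 1.0 to 4.0

bonus_closer = shaped_reward(0.0, x, x_closer, P)
bonus_farther = shaped_reward(0.0, x, x_farther, P)
```

With a zero environment reward, the step toward the origin yields a positive shaped reward and the step away yields a negative one, which is the kind of dense guidance that can reduce the amount of data an agent needs.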
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Adaptive Dynamic Programming Control
