CLF-RL: Control Lyapunov Function Guided Reinforcement Learning
Kejun Li, Zachary Olkin, Yisong Yue, Aaron D. Ames

TL;DR
This paper introduces CLF-RL, a reinforcement learning framework that uses control Lyapunov functions and model-based trajectory planning to improve the robustness and training efficiency of bipedal robot locomotion policies.
Contribution
The paper presents a novel reward shaping method that combines control Lyapunov functions (CLFs) with model-based trajectory planning to improve robustness and simplify reward design in RL for legged robots.
Findings
CLF-RL outperforms baseline RL in robustness during simulation.
CLF-RL achieves better real-world performance on a Unitree G1 robot.
The method provides meaningful intermediate rewards during training.
Abstract
Reinforcement learning (RL) has shown promise in generating robust locomotion policies for bipedal robots, but often suffers from tedious reward design and sensitivity to poorly shaped objectives. In this work, we propose a structured reward shaping framework that leverages model-based trajectory generation and control Lyapunov functions (CLFs) to guide policy learning. We explore two model-based planners for generating reference trajectories: a reduced-order linear inverted pendulum (LIP) model for velocity-conditioned motion planning, and a precomputed gait library based on hybrid zero dynamics (HZD) using full-order dynamics. These planners define desired end-effector and joint trajectories, which are used to construct CLF-based rewards that penalize tracking error and encourage rapid convergence. This formulation provides meaningful intermediate rewards, and is straightforward to…
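To make the idea of a CLF-based reward concrete, here is a minimal sketch in Python. It is not the paper's formulation, which the truncated abstract does not spell out. It assumes a quadratic CLF candidate V(e) = eᵀPe over a tracking error e = (position error, velocity error), a finite-difference estimate of V̇, and a reward that combines a tracking term exp(-V/σ) with a penalty on violating the decrease condition V̇ + λV ≤ 0. The matrix P, the weights, and the names clf_value and clf_reward are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical positive-definite matrix P for V(e) = e^T P e,
# with e = (position error, velocity error). Values are illustrative only.
P = ((2.0, 0.5),
     (0.5, 1.0))


def clf_value(pos_err, vel_err, P=P):
    """Quadratic CLF candidate V(e) = e^T P e for a 2D tracking error."""
    return (P[0][0] * pos_err * pos_err
            + (P[0][1] + P[1][0]) * pos_err * vel_err
            + P[1][1] * vel_err * vel_err)


def clf_reward(v_prev, v_curr, dt, lam=1.0, sigma=1.0,
               w_track=1.0, w_decay=0.1):
    """Assumed reward shape: reward small V, penalize slow convergence.

    v_prev, v_curr: CLF values at the previous and current control step.
    dt:             control step in seconds.
    lam:            desired exponential convergence rate (assumed).
    """
    # Finite-difference estimate of the CLF time derivative.
    v_dot = (v_curr - v_prev) / dt
    # Positive part of V_dot + lam * V: zero when the decrease condition holds.
    violation = max(0.0, v_dot + lam * v_curr)
    # Tracking term lies in (0, 1] and equals 1 at zero error.
    tracking = math.exp(-v_curr / sigma)
    return w_track * tracking - w_decay * violation
```

In this sketch, a step on which the error shrinks fast enough earns only the tracking term, while a step on which the error grows is penalized in proportion to how badly the decrease condition is violated. This is one way a reward can carry a signal at every timestep rather than only at the end of an episode.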
