Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
Zhongyu Li, Jun Zeng, Akshay Thirugnanam, Koushil Sreenath

TL;DR
This paper presents a method to combine model-based safety guarantees with model-free reinforcement learning by identifying low-dimensional linear models of complex robotic systems, demonstrated on a bipedal robot.
Contribution
The paper introduces an approach for extracting low-dimensional linear models from high-dimensional nonlinear systems controlled by an RL policy, so that model-based stability and safety guarantees can be applied to the simpler model.
Findings
Low-dimensional models accurately capture closed-loop dynamics.
The identified linear models are stable and decoupled across control inputs.
Safety guarantees can be applied using model predictive control with barrier functions.
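As a rough illustration of the first two findings, the sketch below fits a discrete-time linear model x[k+1] = A x[k] + B u[k] to logged trajectories by ordinary least squares and checks stability through the spectral radius of A. This is a generic system-identification example on synthetic data, not the authors' procedure: the function names (`fit_linear_model`, `is_stable`), the two-state toy system, and the choice of plain least squares are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_model(X, U):
    """Least-squares fit of x[k+1] = A x[k] + B u[k].

    X: array of shape (T+1, n) holding states x[0..T].
    U: array of shape (T, m) holding inputs u[0..T-1].
    Returns (A, B) with shapes (n, n) and (n, m).
    """
    n = X.shape[1]
    Z = np.hstack([X[:-1], U])   # regressors [x[k], u[k]], shape (T, n+m)
    Y = X[1:]                    # targets x[k+1], shape (T, n)
    # Solve Z @ Theta ~= Y, where Theta stacks A^T on top of B^T.
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    A = Theta[:n].T
    B = Theta[n:].T
    return A, B


def is_stable(A):
    """Discrete-time stability: all eigenvalues strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(A))) < 1.0)


# Hypothetical toy system standing in for logged closed-loop data.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
B_true = np.array([[0.0],
                   [0.5]])

T = 200
U = rng.standard_normal((T, 1))      # random excitation input
X = np.zeros((T + 1, 2))
for k in range(T):
    X[k + 1] = A_true @ X[k] + B_true @ U[k]

A_hat, B_hat = fit_linear_model(X, U)
print("A_hat =\n", A_hat)
print("B_hat =\n", B_hat)
print("stable:", is_stable(A_hat))
```

On real robot data the states would be noisy, so the fit would be approximate and would typically be validated on held-out trajectories before any safety controller relies on it.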
Abstract
Bridging model-based safety and model-free reinforcement learning (RL) for dynamic robots is appealing since model-based methods are able to provide formal safety guarantees, while RL-based methods are able to exploit the robot's agility by learning from the full-order system dynamics. However, current approaches to tackle this problem are mostly restricted to simple systems. In this paper, we propose a new method to combine model-based safety with model-free reinforcement learning by explicitly finding a low-dimensional model of the system controlled by an RL policy and applying stability and safety guarantees on that simple model. We use the complex bipedal robot Cassie, which is a high-dimensional nonlinear system with hybrid dynamics and underactuation, and its RL-based walking controller as an example. We show that a low-dimensional dynamical model is sufficient to capture the dynamics…
Taxonomy
Topics: Mechanical Circulatory Support Devices · Prosthetics and Rehabilitation Robotics
