Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration
Xiang Gao, Jennie Si, Yue Wen, Minhan Li, He (Helen) Huang

TL;DR
This paper introduces flexible policy iteration (FPI), a reinforcement learning method for controlling a robotic knee with a human in the loop. The authors show system stability and near-optimal performance, and validate the approach in simulation.
Contribution
The paper develops a novel flexible policy iteration (FPI) algorithm that integrates experience replay and supplemental values from prior experience into the RL controller, and provides system-level performance guarantees for human-robot systems.
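To make the idea of combining policy iteration with replayed experience concrete, here is a minimal sketch. It is not the authors' FPI algorithm and does not model the robotic knee: it is generic least-squares policy iteration on a hypothetical scalar linear plant with a quadratic cost, where every policy evaluation step reuses the same stored batch of transitions. The plant constants, cost weights, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical toy plant, NOT the paper's knee model: x_next = A*x + B*u
A, B = 0.9, 0.5
Q_COST, R_COST = 1.0, 0.1  # stage cost = Q_COST*x^2 + R_COST*u^2

def features(x, u):
    """Quadratic basis: Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2."""
    return [x * x, 2.0 * x * u, u * u]

def collect_replay_buffer(n_samples, seed=0):
    """Store (x, u, cost, x_next) transitions gathered with random actions."""
    rng = random.Random(seed)
    buffer = []
    for _ in range(n_samples):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        buffer.append((x, u, Q_COST * x * x + R_COST * u * u, A * x + B * u))
    return buffer

def solve_3x3(m, v):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    n = 3
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    t = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * t[c] for c in range(r + 1, n))
        t[r] = (aug[r][n] - s) / aug[r][r]
    return t

def evaluate_policy(buffer, k):
    """Least-squares fit of Q(x, u) = cost + Q(x_next, -k*x_next)
    over every stored transition (normal equations)."""
    ata = [[0.0] * 3 for _ in range(3)]
    atc = [0.0] * 3
    for x, u, cost, x_next in buffer:
        phi = features(x, u)
        phi_next = features(x_next, -k * x_next)
        d = [p - pn for p, pn in zip(phi, phi_next)]
        for i in range(3):
            atc[i] += d[i] * cost
            for j in range(3):
                ata[i][j] += d[i] * d[j]
    return solve_3x3(ata, atc)

def improve_policy(theta):
    """Greedy gain for u = -k*x: minimize the fitted quadratic Q over u."""
    return theta[1] / theta[2]

def policy_iteration(buffer, k0=0.0, n_iters=10):
    """Alternate evaluation and improvement, replaying the same buffer."""
    k = k0  # k0 = 0 is stabilizing here because |A| < 1
    for _ in range(n_iters):
        k = improve_policy(evaluate_policy(buffer, k))
    return k

buffer = collect_replay_buffer(50)
k_learned = policy_iteration(buffer)
print("learned feedback gain:", k_learned)
```

Because the toy plant is deterministic and its Q-function is exactly quadratic, the learned gain should match the gain obtained from the scalar Riccati equation. The paper's setting, with a human in the loop and supplemental values from prior experience, is considerably richer than this sketch.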
Findings
Proves convergence of the approximate value function.
Demonstrates system stability and near-optimality.
Validates effectiveness through realistic simulations.
Abstract
We are motivated by the real challenges presented in a human-robot system to develop new designs that are efficient at data level and with performance guarantees such as stability and optimality at systems level. Existing approximate/adaptive dynamic programming (ADP) results that consider system performance theoretically are not readily providing practically useful learning control algorithms for this problem; and reinforcement learning (RL) algorithms that address the issue of data efficiency usually do not have performance guarantees for the controlled system. This study fills these important voids by introducing innovative features to the policy iteration algorithm. We introduce flexible policy iteration (FPI), which can flexibly and organically integrate experience replay and supplemental values from prior experience into the RL controller. We show system level performances…
Taxonomy
Topics: Adaptive Dynamic Programming Control · Mechanical Circulatory Support Devices · Reinforcement Learning in Robotics
Methods: Experience Replay
