Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration

Xiang Gao; Jennie Si; Yue Wen; Minhan Li; He (Helen) Huang

arXiv:2006.09008·eess.SY·January 19, 2021·1 cites

Open Access

TL;DR

This paper introduces flexible policy iteration (FPI), a reinforcement learning control method for a robotic knee with a human in the loop. It provides stability and near-optimality guarantees for the controlled system and validates the method in realistic simulations.

Contribution

The paper develops flexible policy iteration (FPI), a policy iteration algorithm that integrates experience replay and supplemental values from prior experience into the RL controller, with system-level performance guarantees for the human-robot system.

Findings

1. Proves convergence of the approximate value function.

2. Demonstrates system stability and near-optimality.

3. Validates effectiveness through realistic simulations.

Abstract

We are motivated by the real challenges presented in a human-robot system to develop new designs that are efficient at data level and with performance guarantees such as stability and optimality at systems level. Existing approximate/adaptive dynamic programming (ADP) results that consider system performance theoretically are not readily providing practically useful learning control algorithms for this problem; and reinforcement learning (RL) algorithms that address the issue of data efficiency usually do not have performance guarantees for the controlled system. This study fills these important voids by introducing innovative features to the policy iteration algorithm. We introduce flexible policy iteration (FPI), which can flexibly and organically integrate experience replay and supplemental values from prior experience into the RL controller. We show system level performances…
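To make the idea of combining policy iteration with experience replay concrete, here is a minimal sketch. It is not the authors' FPI algorithm and does not model the robotic knee. It assumes a toy scalar linear system x' = a*x + b*u with quadratic stage cost q*x^2 + r*u^2, a quadratic Q-function fitted by least squares, and a fixed buffer of stored transitions that is reused at every iteration. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def features(x, u):
    # Quadratic basis so that Q(x, u) = w0*x^2 + w1*x*u + w2*u^2.
    return np.array([x * x, x * u, u * u])

def collect_replay_buffer(a, b, q, r, n_samples, rng):
    # Store (state, action, cost, next state) tuples from random exploration.
    # The toy dynamics and cost are assumptions, not the knee model.
    buffer = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)
        cost = q * x * x + r * u * u
        x_next = a * x + b * u
        buffer.append((x, u, cost, x_next))
    return buffer

def evaluate_policy(buffer, k):
    # Policy evaluation for u = -k*x: solve the Bellman equation
    # Q(x, u) - Q(x', -k*x') = cost in the least-squares sense.
    A = np.array([features(x, u) - features(xn, -k * xn)
                  for x, u, _, xn in buffer])
    c = np.array([cost for _, _, cost, _ in buffer])
    w, *_ = np.linalg.lstsq(A, c, rcond=None)
    return w

def improve_policy(w):
    # Minimizing w0*x^2 + w1*x*u + w2*u^2 over u gives u = -(w1 / (2*w2)) * x.
    return w[1] / (2.0 * w[2])

def policy_iteration(buffer, k0=0.0, n_iters=20):
    # The same replayed samples are reused in every evaluation step.
    k = k0
    for _ in range(n_iters):
        w = evaluate_policy(buffer, k)
        k = improve_policy(w)
    return k

rng = np.random.default_rng(0)
buffer = collect_replay_buffer(a=0.9, b=0.5, q=1.0, r=1.0,
                               n_samples=200, rng=rng)
k = policy_iteration(buffer)
print("learned feedback gain:", k)
```

The initial gain k0 = 0 is stabilizing here because |a| < 1. In this linear-quadratic toy case the learned gain can be checked against the solution of the scalar discrete-time Riccati equation.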

Taxonomy

Topics: Adaptive Dynamic Programming Control · Mechanical Circulatory Support Devices · Reinforcement Learning in Robotics

Methods: Experience Replay