StROL: Stabilized and Robust Online Learning from Humans

Shaunak A. Mehta, Forrest Meng, Andrea Bajcsy, and Dylan P. Losey

arXiv:2308.09863 · cs.RO · January 5, 2024 · 2 citations

TL;DR

StROL introduces a stability-focused online learning algorithm that adapts gradient descent rules to reliably infer human preferences in noisy and suboptimal interaction scenarios, improving accuracy and reducing regret.

Contribution

The paper presents a method, based on Lyapunov stability analysis, for modifying gradient descent learning rules so that online inference of the human's reward is more robust and more likely to converge.

Findings

01. StROL achieves more accurate reward inference in simulations.

02. StROL reduces regret compared to existing methods.

03. The approach maintains stability even with noisy and biased human inputs.

Abstract

Robots often need to learn the human's reward function online, during the current interaction. This real-time learning requires fast but approximate learning rules: when the human's behavior is noisy or suboptimal, current approximations can result in unstable robot learning. Accordingly, in this paper we seek to enhance the robustness and convergence properties of gradient descent learning rules when inferring the human's reward parameters. We model the robot's learning algorithm as a dynamical system over the human preference parameters, where the human's true (but unknown) preferences are the equilibrium point. This enables us to perform Lyapunov stability analysis to derive the conditions under which the robot's learning dynamics converge. Our proposed algorithm (StROL) uses these conditions to learn robust-by-design learning rules: given the original learning dynamics, StROL…
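
To make the dynamical-system view in the abstract concrete, here is a minimal Python sketch. It is not the StROL algorithm or the code from the paper's repository. It assumes a toy human model in which each input points from the robot's current estimate toward the true parameters (optionally with Gaussian noise), a plain gradient-style update with step size alpha, and the squared estimation error as a candidate Lyapunov function. The names run_learning, lyapunov, alpha, and noise_std are hypothetical and chosen only for illustration.

```python
import numpy as np


def lyapunov(theta, theta_star):
    """Candidate Lyapunov function: squared distance to the true parameters."""
    err = theta_star - theta
    return float(err @ err)


def run_learning(theta0, theta_star, alpha, steps, noise_std=0.0, seed=0):
    """Iterate a toy learning rule and record the Lyapunov value at each step.

    Assumption (not from the paper): the human input is
    u_h = (theta_star - theta) + noise, and the robot updates
    theta <- theta + alpha * u_h.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    values = [lyapunov(theta, theta_star)]
    for _ in range(steps):
        noise = noise_std * rng.standard_normal(theta.shape)
        u_h = (theta_star - theta) + noise
        theta = theta + alpha * u_h
        values.append(lyapunov(theta, theta_star))
    return theta, values


theta_star = np.array([1.0, -0.5])  # true (unknown to the robot) preferences

# Without noise the error is scaled by (1 - alpha) each step, so the
# Lyapunov value shrinks whenever 0 < alpha < 2 and grows otherwise.
theta_hat, v_stable = run_learning([0.0, 0.0], theta_star, alpha=0.5, steps=20)
_, v_unstable = run_learning([0.0, 0.0], theta_star, alpha=2.5, steps=20)

print("stable run, final estimate:", theta_hat)
print("stable run decreases every step:",
      all(b < a for a, b in zip(v_stable, v_stable[1:])))
print("unstable run grows:", v_unstable[-1] > v_unstable[0])
```

In this toy setting the true parameters are the equilibrium of the update, and the step-size range 0 < alpha < 2 plays the role of a convergence condition. Setting noise_std above zero lets you observe how noisy inputs keep the estimate from settling exactly at the equilibrium, which is the kind of behavior the abstract says motivates more robust learning rules.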

Code & Models

Repositories

vt-collab/strol_ral
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Functional Brain Connectivity Studies