Fast Non-Episodic Adaptive Tuning of Robot Controllers with Online Policy Optimization
James A. Preiss, Fengze Xie, Yiheng Lin, Adam Wierman, and Yisong Yue

TL;DR
This paper introduces M-GAPS, an online policy optimization algorithm for real-time tuning of robot controllers that adapts quickly to changing dynamics without relying on episodes, demonstrated on quadrotors and cars.
Contribution
The paper presents M-GAPS, a practical, model-based online policy optimization method for continuous, non-episodic robot control, improving adaptation speed and robustness.
Findings
M-GAPS finds near-optimal parameters more quickly than model-based and model-free baselines that impose artificial episodes, especially when the episode length is unfavorable.
M-GAPS adapts rapidly to wind and payload disturbances.
The approach is effective on quadrotors and scaled cars.
Abstract
We study online algorithms to tune the parameters of a robot controller in a setting where the dynamics, policy class, and optimality objective are all time-varying. The system follows a single trajectory without episodes or state resets, and the time-varying information is not known in advance. Focusing on nonlinear geometric quadrotor controllers as a test case, we propose a practical implementation of a single-trajectory model-based online policy optimization algorithm, M-GAPS, along with reparameterizations of the quadrotor state space and policy class to improve the optimization landscape. In hardware experiments, we compare to model-based and model-free baselines that impose artificial episodes. We show that M-GAPS finds near-optimal parameters more quickly, especially when the episode length is not favorable. We also show that M-GAPS rapidly adapts to heavy unmodeled wind and…
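To make the single-trajectory setting concrete, here is a minimal toy sketch of online gradient tuning without episodes or resets. It is not the authors' M-GAPS implementation: the scalar plant, the feedback law u = -k*x, the cost x^2 + r*u^2, the disturbance and dynamics schedules, and every name and constant below are assumptions chosen only for illustration. The idea shown is that a model of the dynamics lets the tuner carry a running sensitivity dx/dk forward in time, so the gain can be updated at every step.

```python
def tune_gain_online(k0=0.2, eta=1e-3, r=0.1, steps=2000, k_min=0.05, k_max=1.5):
    """Tune the gain k of u = -k*x along one trajectory of x' = x + b_t*u + w_t.

    Toy illustration only; all dynamics and constants are hypothetical.
    Returns the gain after every time step.
    """
    k, x, s = k0, 0.0, 0.0  # s tracks the sensitivity dx/dk along the trajectory
    gains = []
    for t in range(steps):
        b = 1.0 if t < steps // 2 else 0.5        # input gain changes mid-run
        w = 0.5 if (t // 50) % 2 == 0 else -0.5   # square-wave disturbance

        u = -k * x
        du_dk = -x - k * s                        # total derivative of u w.r.t. k
        grad = 2.0 * x * s + 2.0 * r * u * du_dk  # d/dk of the cost x^2 + r*u^2

        # Advance the state and its sensitivity using the current model.
        s = s + b * du_dk
        x = x + b * u + w

        # Gradient step, projected onto an interval that keeps |1 - b*k| < 1.
        k = min(max(k - eta * grad, k_min), k_max)
        gains.append(k)
    return gains
```

Calling tune_gain_online() runs one uninterrupted trajectory and returns the gain history, which can be plotted to see how the deliberately sluggish initial gain is adjusted as the disturbance and input gain change. The projection step is a design choice for this toy problem: it keeps the closed loop stable for every admissible gain, so the state and its sensitivity stay bounded.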
Taxonomy
Topics: Iterative Learning Control Systems · Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics
