Fast Non-Episodic Adaptive Tuning of Robot Controllers with Online Policy Optimization

James A. Preiss, Fengze Xie, Yiheng Lin, Adam Wierman, and Yisong Yue

arXiv:2507.10914 · cs.RO · July 16, 2025

TL;DR

This paper introduces M-GAPS, an online policy optimization algorithm for real-time tuning of robot controllers that adapts quickly to changing dynamics without relying on episodes, demonstrated on quadrotors and cars.

Contribution

The paper presents M-GAPS, a practical, model-based online policy optimization method for continuous, non-episodic robot control, improving adaptation speed and robustness.

Findings

01. M-GAPS outperforms model-based and model-free baselines in speed and accuracy.

02. M-GAPS adapts rapidly to wind and payload disturbances.

03. The approach is effective on quadrotors and scaled cars.

Abstract

We study online algorithms to tune the parameters of a robot controller in a setting where the dynamics, policy class, and optimality objective are all time-varying. The system follows a single trajectory without episodes or state resets, and the time-varying information is not known in advance. Focusing on nonlinear geometric quadrotor controllers as a test case, we propose a practical implementation of a single-trajectory model-based online policy optimization algorithm, M-GAPS, along with reparameterizations of the quadrotor state space and policy class to improve the optimization landscape. In hardware experiments, we compare to model-based and model-free baselines that impose artificial episodes. We show that M-GAPS finds near-optimal parameters more quickly, especially when the episode length is not favorable. We also show that M-GAPS rapidly adapts to heavy unmodeled wind and…
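To make the single-trajectory setting concrete, the following is a minimal sketch of online gradient-based controller tuning on a toy problem. It is not M-GAPS and is not taken from the paper: the scalar linear plant, the constant disturbance, the quadratic cost, the function name `tune_gain_online`, and every numeric value are assumptions chosen for illustration. The sketch propagates a forward sensitivity of the state with respect to the gain using a known model, treating the gain as if it were constant, and takes one projected gradient step per time step with no episodes or resets. For this toy plant with `b = 1`, the steady-state cost is minimized at `k = 1 / (r * (1 - a))`, which is 0.5 for the default values.

```python
def tune_gain_online(a=0.5, b=1.0, w=0.1, r=4.0, eta=0.5,
                     k_min=0.0, k_max=1.4, steps=2000):
    """Tune a feedback gain k online along one trajectory, with no resets.

    Hypothetical toy plant: x[t+1] = a*x[t] + b*u[t] + w
    Policy class:           u[t]   = -k*x[t]
    Per-step cost:          c[t]   = x[t]**2 + r*u[t]**2
    """
    x = 0.0   # plant state
    s = 0.0   # sensitivity dx/dk, propagated forward with the known model
    k = 0.0   # controller gain being tuned
    costs = []
    for _ in range(steps):
        u = -k * x
        costs.append(x * x + r * u * u)

        # Chain rule: dc/dk = 2*x*dx/dk + 2*r*u*du/dk, with du/dk = -x - k*s.
        grad = 2.0 * x * s + 2.0 * r * u * (-x - k * s)

        # Advance the sensitivity and the state using the gain just applied.
        s = (a - b * k) * s - b * x
        x = a * x + b * u + w

        # Projected gradient step; the bounds keep |a - b*k| < 1 for the
        # default values, so the closed loop stays stable while tuning.
        k = min(max(k - eta * grad, k_min), k_max)
    return k, costs


if __name__ == "__main__":
    k_final, costs = tune_gain_online()
    print("final gain:", k_final)
    print("mean cost over last 100 steps:", sum(costs[-100:]) / 100)
```

The projection onto `[k_min, k_max]` is a design choice of this sketch that guards against destabilizing gains during the transient; the paper's quadrotor controllers and its reparameterizations are considerably more involved.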

Taxonomy

Topics: Iterative Learning Control Systems · Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics