On Approximate Dynamic Programming with Multivariate Splines for Adaptive Control
Willem Eerland, Coen de Visser, Erik-Jan van Kampen

TL;DR
This paper introduces an SDP framework built on multivariate simplex B-splines and a modified RLSTD (recursive least-squares temporal difference) algorithm with a local forget factor, which lets the method track time-varying systems in adaptive control.
Contribution
It integrates a local forget factor, designed to preserve the continuity of the simplex splines, into RLSTD, so that the resulting SDP method can track time-varying systems and recover faster after a system change.
Findings
For the same number of function approximator parameters, SDP with multivariate splines outperforms NDP in stability and learning rate.
SDP combined with the modified RLSTD recovers faster from system changes than SDP with the original RLSTD.
SDP requires more computations per time-step than NDP.
Abstract
We define an SDP framework based on the RLSTD algorithm and multivariate simplex B-splines. We introduce a local forget factor capable of preserving the continuity of the simplex splines. This local forget factor is integrated with the RLSTD algorithm, resulting in a modified RLSTD algorithm that is capable of tracking time-varying systems. We present the results of two numerical experiments: one validates SDP and compares it with NDP, and the other shows the advantages of the modified RLSTD algorithm over the original. While SDP requires more computations per time-step, the first experiment shows that, for the same number of function approximator parameters, performance increases in terms of stability and learning rate compared to NDP. The second experiment shows that SDP in combination with the modified RLSTD algorithm allows for faster recovery compared to the original RLSTD…
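To make the role of a forget factor in RLSTD more concrete, here is a minimal sketch of a textbook RLS-TD(0) update with a single global forgetting factor. It is not the paper's modified algorithm: the local forget factor and the simplex B-spline basis are not reproduced here. The one-hot features, the two-state chain, and the names `rlstd_update` and `run_two_state_chain` are assumptions made purely for illustration.

```python
import numpy as np


def rlstd_update(theta, P, phi, phi_next, reward, gamma=0.9, forget=1.0):
    """One RLS-TD(0) step with a single global forgetting factor.

    forget = 1.0 weights all past samples equally; forget < 1.0
    discounts old samples exponentially (illustrative, not the
    paper's local forget factor).
    """
    d = phi - gamma * phi_next               # TD feature difference
    P_phi = P @ phi
    gain = P_phi / (forget + d @ P_phi)      # gain vector
    theta = theta + gain * (reward - d @ theta)
    P = (P - np.outer(gain, d @ P)) / forget
    return theta, P


def run_two_state_chain(n_steps, forget, switch_at=None, gamma=0.9):
    """Hypothetical test problem: deterministic chain 0 -> 1 -> 0 -> ...

    The reward in state 0 is 1.0, and jumps to 2.0 from step
    `switch_at` onward to mimic a time-varying system.
    """
    features = np.eye(2)                     # one-hot stand-in for a spline basis
    theta = np.zeros(2)
    P = 1e3 * np.eye(2)                      # large initial covariance
    state = 0
    for t in range(n_steps):
        next_state = 1 - state
        r0 = 2.0 if (switch_at is not None and t >= switch_at) else 1.0
        reward = r0 if state == 0 else 0.0
        theta, P = rlstd_update(theta, P, features[state],
                                features[next_state], reward, gamma, forget)
        state = next_state
    return theta


if __name__ == "__main__":
    # Stationary case: V(0) = 1 / (1 - gamma**2) for this chain.
    print(run_two_state_chain(400, forget=1.0))
    # Reward change at step 200, evaluated 50 steps later.
    print(run_two_state_chain(250, forget=1.0, switch_at=200))
    print(run_two_state_chain(250, forget=0.95, switch_at=200))
```

In this toy setting, the estimate with `forget=0.95` moves toward the post-change value much sooner than the estimate with `forget=1.0`, which still averages over the outdated samples. This is the general motivation for forgetting in recursive least squares; how the paper localizes the forgetting while keeping the spline continuous is described in the paper itself.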
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
