Thompson Sampling-Based Learning and Control for Unknown Dynamic Systems
Kaikai Zheng, Dawei Shi, Yang Shi, Long Wang

TL;DR
This paper introduces a novel Thompson sampling-based method for learning and controlling unknown nonlinear systems using reproducing kernel Hilbert spaces, providing theoretical guarantees and demonstrating effectiveness through numerical experiments.
Contribution
The paper develops a kernel-based framework for learning control laws with a Thompson sampling approach, extending the applicability of Thompson sampling to general function spaces in control system design.
Findings
The method achieves exponential convergence in learning control laws.
The approach provides bounds on control regret.
Numerical experiments validate the method's effectiveness.
Abstract
Thompson sampling (TS) is a Bayesian randomized exploration strategy that samples options (e.g., system parameters or control laws) from the current posterior and then applies the selected option that is optimal for a task, thereby balancing exploration and exploitation; this makes TS effective for active learning-based controller design. However, TS relies on finite parametric representations, which limits its applicability to the more general spaces commonly encountered in control system design. To address this issue, this work proposes a parameterization method for control law learning using reproducing kernel Hilbert spaces and designs a data-driven active learning control approach. Specifically, the proposed method treats the control law as an element in a function space, allowing the design of control laws without imposing restrictions on the system structure or the…
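To make the sample-then-act loop described in the abstract concrete, here is a minimal sketch of classical Thompson sampling in the finite parametric setting that the abstract identifies as the limitation. It is not the paper's kernel-based method. Everything in it is an assumption chosen for illustration: the scalar system x_{t+1} = a x_t + u_t + w_t with a single unknown parameter a, the Gaussian prior, the known noise level, the deadbeat control rule, and the function name `thompson_sampling_scalar`.

```python
import numpy as np


def thompson_sampling_scalar(a_true=1.5, noise_std=0.1, steps=200, seed=0):
    """Thompson sampling for x_{t+1} = a*x_t + u_t + w_t with unknown scalar a.

    Illustrative assumptions: Gaussian prior N(0, 1) on a, Gaussian noise with
    known standard deviation, and a deadbeat controller for the sampled model.
    """
    rng = np.random.default_rng(seed)
    mean, precision = 0.0, 1.0  # posterior over a, stored as mean and precision
    x = 1.0
    states = [x]
    for _ in range(steps):
        # 1. Sample a model parameter from the current posterior.
        a_sample = rng.normal(mean, 1.0 / np.sqrt(precision))
        # 2. Apply the control that is optimal for the sampled model
        #    (it would drive the state to zero in one step if a_sample were exact).
        u = -a_sample * x
        # 3. Observe the response of the true system.
        x_next = a_true * x + u + noise_std * rng.normal()
        # 4. Conjugate Gaussian update from the regression y = a*x + w.
        y = x_next - u
        new_precision = precision + x**2 / noise_std**2
        mean = (precision * mean + x * y / noise_std**2) / new_precision
        precision = new_precision
        x = x_next
        states.append(x)
    return mean, precision, np.array(states)


if __name__ == "__main__":
    mean, precision, states = thompson_sampling_scalar()
    print("posterior mean of a:", mean)
    print("posterior std of a:", 1.0 / np.sqrt(precision))
    print("final |x|:", abs(states[-1]))
```

Because the posterior here is over one real number, the update is a two-line formula. The abstract's point is that this kind of finite parameterization does not directly cover control laws treated as elements of a function space.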
Taxonomy
Topics: Advanced Bandit Algorithms Research · Control Systems and Identification · Gaussian Processes and Bayesian Inference
