Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II
Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra

TL;DR
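This paper develops cost-driven methods for learning state representations for control from partial and potentially high-dimensional observations. It gives finite-sample guarantees for finding a near-optimal representation function and a near-optimal controller in infinite-horizon time-invariant LQG control, and it analyzes a MuZero-like variant in which the latent dynamics are learned implicitly.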
Contribution
It studies two approaches to cost-driven representation learning for LQG control, one that learns the latent transition function explicitly and one that learns it implicitly in a manner similar to MuZero, with finite-sample guarantees and new technical analysis.
Findings
Finite-sample guarantees for near-optimal control using learned representations.
Analysis of implicit latent dynamics learning akin to MuZero.
Proof of persistency of excitation for a new stochastic process.
Abstract
We study the problem of state representation learning for control from partial and potentially high-dimensional observations. We approach this problem via cost-driven state representation learning, in which we learn a dynamical model in a latent state space by predicting cumulative costs. In particular, we establish finite-sample guarantees on finding a near-optimal representation function and a near-optimal controller using the learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. We study two approaches to cost-driven representation learning, which differ in whether the transition function of the latent state is learned explicitly or implicitly. The first approach has also been investigated in Part I of this work, for finite-horizon time-varying LQG control. The second approach closely resembles MuZero, a recent breakthrough in empirical…
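To make the phrase "predicting cumulative costs" concrete, here is a minimal sketch of the data such a method would regress on. It is an illustration under stated assumptions, not the paper's algorithm: the system matrices, the noise level, the random-input policy, and the helper names `simulate_lqg` and `cumulative_costs` are all hypothetical.

```python
import numpy as np

def simulate_lqg(A, B, C, Q, R, T, rng, noise_std=0.1):
    """Roll out x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + v_t with random inputs.

    Returns observations, inputs, and per-step quadratic costs
    c_t = x_t^T Q x_t + u_t^T R u_t. The learner would see y, u, c but not x.
    """
    n, m = B.shape
    x = np.zeros(n)
    ys, us, costs = [], [], []
    for _ in range(T):
        u = rng.standard_normal(m)                               # exploratory input (assumption)
        y = C @ x + noise_std * rng.standard_normal(C.shape[0])  # noisy partial observation
        costs.append(x @ Q @ x + u @ R @ u)
        ys.append(y)
        us.append(u)
        x = A @ x + B @ u + noise_std * rng.standard_normal(n)   # process noise
    return np.array(ys), np.array(us), np.array(costs)

def cumulative_costs(costs, k):
    """Sum of k consecutive costs starting at each step t (requires k >= 1).

    These sums are the regression targets a cost-driven method would try to
    predict from the history of observations and inputs.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    csum = np.concatenate([[0.0], np.cumsum(costs)])
    return csum[k:] - csum[:-k]

# Hypothetical two-state system observed through a three-dimensional output.
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = rng.standard_normal((3, 2))
Q, R = np.eye(2), np.eye(1)

ys, us, costs = simulate_lqg(A, B, C, Q, R, T=200, rng=rng)
targets = cumulative_costs(costs, k=5)
```

A representation function would then map histories of `ys` and `us` to a low-dimensional latent state chosen so that `targets` can be predicted from it. How that map and the latent dynamics are fit is what the two approaches in the abstract differ on.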
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics
