Multitask LQG Control: Performance and Generalization Bounds
Leonardo F. Toso, Kasra Fallah, Charis Stamouli, George J. Pappas, and James Anderson

TL;DR
This paper analyzes multitask learning for LQG control, providing performance and generalization bounds, and showing variance reduction in policy gradient methods with multiple tasks.
Contribution
It introduces a bisimulation-based framework for analyzing heterogeneity in multitask LQG control and derives new guarantees for policy gradient methods.
Findings
Learning a common lifted controller induces heterogeneity bias.
Performance and generalization bounds depend on bisimulation measures.
Multitask learning reduces the variance of policy gradient estimates in proportion to the number of training tasks.
Abstract
We study multitask learning for stochastic and partially observed control systems, focusing on the linear quadratic Gaussian (LQG) problem. Our goal is to learn a common stabilizing controller that generalizes across a distribution of systems and objectives. To this end, we leverage a history-dependent lifting that recasts the multitask LQG problem into an equivalent high-dimensional multitask LQR problem, allowing for the analysis of policy gradient methods. We show that learning a common lifted controller induces a heterogeneity bias, which we characterize via a "bisimulation function". We establish performance and generalization guarantees that explicitly depend on such bisimulation-based heterogeneity measures. In the model-free setting, we demonstrate that multitask learning reduces policy gradient estimation variance proportionally to the number of tasks in the training set.
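The variance-reduction claim can be illustrated with a small numerical sketch. This is not the paper's algorithm: it assumes a fully observed LQR surrogate with a static gain `K` (no history-dependent lifting), a standard one-point zeroth-order gradient estimator, and hypothetical system matrices `A0`, `B0` with small random perturbations standing in for task heterogeneity. All function and variable names are illustrative.

```python
import numpy as np

def rollout_cost(A, B, K, Q, R, x0, horizon=20):
    """Finite-horizon quadratic cost of the static feedback u = -K x."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

def zeroth_order_grad(A, B, K, Q, R, rng, radius=0.05):
    """One-point zeroth-order gradient estimate for a single task."""
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)                # random direction on the unit sphere
    x0 = rng.standard_normal(A.shape[0])  # random initial state
    cost = rollout_cost(A, B, K + radius * U, Q, R, x0)
    return (K.size / radius) * cost * U

def multitask_grad(tasks, K, Q, R, rng):
    """Average the per-task estimates over all tasks in the list."""
    return np.mean([zeroth_order_grad(A, B, K, Q, R, rng) for A, B in tasks], axis=0)

def estimator_variance(tasks, K, Q, R, rng, trials=300):
    """Empirical total variance of the averaged estimator over repeated draws."""
    grads = np.stack([multitask_grad(tasks, K, Q, R, rng) for _ in range(trials)])
    return grads.var(axis=0).sum()

rng = np.random.default_rng(0)
n, m = 2, 1
A0 = np.array([[0.9, 0.2], [0.0, 0.8]])   # hypothetical nominal dynamics
B0 = np.array([[0.0], [1.0]])
Q, R = np.eye(n), np.eye(m)
K = np.zeros((m, n))

# Sixteen mildly heterogeneous tasks (assumed perturbation scale 0.02).
tasks = [(A0 + 0.02 * rng.standard_normal((n, n)), B0) for _ in range(16)]

var_single = estimator_variance(tasks[:1], K, Q, R, rng)
var_multi = estimator_variance(tasks, K, Q, R, rng)
print(f"variance with 1 task:   {var_single:.3e}")
print(f"variance with 16 tasks: {var_multi:.3e}")
```

Because the per-task estimates are drawn independently, averaging over M tasks should shrink the estimator variance by roughly a factor of M in this toy setting. The sketch does not capture the heterogeneity bias that the paper characterizes through bisimulation-based measures.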
