Bilinear Classes: A Structural Framework for Provable Generalization in RL
Simon S. Du, Sham M. Kakade, Jason D. Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang

TL;DR
This paper introduces Bilinear Classes, a structural framework enabling provable generalization in reinforcement learning with function approximation, covering existing models and new ones like the Linear Q*/V* model, with polynomial sample complexity bounds.
Contribution
The paper defines Bilinear Classes, a unified framework for generalization in RL with function approximation, and gives an algorithm whose polynomial sample complexity nearly matches the best known bounds for existing models. The framework also extends to infinite-dimensional (RKHS) settings.
Findings
Polynomial sample complexity for Bilinear Classes.
Framework includes new models like Linear Q*/V*.
Sample bounds depend on information-theoretic quantities.
Abstract
This work introduces Bilinear Classes, a new structural framework, which permit generalization in reinforcement learning in a wide variety of settings through the use of function approximation. The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable, and, notably, also includes new models, such as the Linear Q*/V* model in which both the optimal Q-function and the optimal V-function are linear in some known feature space. Our main result provides an RL algorithm which has polynomial sample complexity for Bilinear Classes; notably, this sample complexity is stated in terms of a reduction to the generalization error of an underlying supervised learning sub-problem. These bounds nearly match the best known sample complexity bounds for existing models. Furthermore, this framework also extends to the infinite dimensional (RKHS)…
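To make the Linear Q*/V* model named in the abstract concrete, the condition can be sketched roughly as follows. The notation is an assumption introduced here for illustration and does not appear in this summary: φ and ψ stand for the known feature maps on state-action pairs and on states, and w*, θ* for the unknown weight vectors.

```latex
% Illustrative notation (assumed, not taken from the summary above):
%   \phi(s,a), \psi(s) : known feature maps
%   w^\star, \theta^\star : unknown weight vectors
\[
Q^\star(s,a) = \langle w^\star, \phi(s,a) \rangle
\quad\text{and}\quad
V^\star(s) = \max_{a} Q^\star(s,a) = \langle \theta^\star, \psi(s) \rangle
\qquad \text{for all states } s \text{ and actions } a .
\]
```

Both conditions are required at once: linearity of Q* in φ does not by itself make V*, a maximum over actions, linear in a state feature map.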
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Machine Learning and Algorithms
