Provable General Function Class Representation Learning in Multitask Bandits and MDPs
Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang

TL;DR
This paper extends the theoretical analysis of multitask representation learning in reinforcement learning from linear to general function classes, validates its benefits for bandits and MDPs, and demonstrates its effectiveness empirically with neural network representations.
Contribution
The paper introduces a theoretical framework for analyzing general function class representations in multitask RL, proposes the GFUCB algorithm, and validates the approach empirically with neural network representations.
Findings
Theoretical validation of multitask representation learning benefits in general function classes.
First theoretical analysis of these benefits, under a general function class representation, for bandits and linear MDPs.
Experimental evidence showing neural network representations improve performance.
Abstract
While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analytical works could only assume that the representation function is already known to the agent or comes from a linear function class, since analyzing general function class representations encounters non-trivial technical obstacles such as generalization guarantees and the formulation of confidence bounds in abstract function space. However, linear-case analysis relies heavily on the particularities of the linear function class, while real-world practice usually adopts general non-linear representation functions such as neural networks, which significantly reduces the applicability of that analysis. In this work, we extend the analysis to general function class representations. Specifically, we consider an…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Receptor Mechanisms and Signaling · Reinforcement Learning in Robotics
