Learning Efficiently Function Approximation for Contextual MDP
Orin Levy, Yishay Mansour

TL;DR
This paper develops an efficient method for learning contextual MDPs with function approximation for both the rewards and the dynamics. It gives polynomial sample and time complexity bounds (assuming an efficient ERM oracle) and reduces the problem to supervised learning.
Contribution
It introduces a unified approach for learning contextual MDPs whose dynamics are either context-dependent or context-independent, with theoretical guarantees in both cases.
Findings
Polynomial sample and time complexity for both models, assuming an efficient ERM oracle
A general reduction from learning contextual MDPs to supervised learning
Applies whether or not the dynamics depend on the context
Abstract
We study learning contextual MDPs using a function approximation for both the rewards and the dynamics. We consider both the case where the dynamics depend on the context and the case where they are independent of it. For both models we derive polynomial sample and time complexity (assuming an efficient ERM oracle). Our methodology gives a general reduction from learning contextual MDPs to supervised learning.
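To make the phrase "ERM oracle" concrete, here is a minimal sketch of what such an oracle could look like for the reward part alone. It is an illustration under stated assumptions, not the paper's algorithm: the finite function class, the linear reward form `w * context`, the synthetic data, and the names `erm_oracle` and `make_reward_fn` are all hypothetical. The oracle simply returns the function in the class with the smallest empirical squared loss on observed (context, state, action, reward) tuples.

```python
import random

def erm_oracle(function_class, data):
    """Return the function in a finite class with minimal empirical squared loss.

    `data` is a list of (context, state, action, reward) tuples.
    """
    def empirical_loss(f):
        return sum((f(c, s, a) - r) ** 2 for (c, s, a, r) in data) / len(data)
    return min(function_class, key=empirical_loss)

def make_reward_fn(w):
    # Hypothetical reward model: action 1 pays w * context, action 0 pays nothing.
    return lambda c, s, a: w * c if a == 1 else 0.0

# Hypothetical finite function class indexed by a single weight.
candidate_weights = [0.0, 0.25, 0.5, 0.75, 1.0]
function_class = [make_reward_fn(w) for w in candidate_weights]

# Synthetic observations generated from an assumed true weight of 0.5.
rng = random.Random(0)
true_w = 0.5
data = []
for _ in range(200):
    c = rng.random()          # context drawn uniformly from [0, 1)
    s = 0                     # a single dummy state
    a = rng.choice([0, 1])    # action chosen uniformly at random
    r = true_w * c * a + rng.gauss(0, 0.05)  # noisy observed reward
    data.append((c, s, a, r))

best = erm_oracle(function_class, data)
print(best(1.0, 0, 1))  # predicted reward of action 1 at context 1.0
```

With a finite class the oracle is a plain search over candidates; for richer classes the same role would be played by whatever supervised learner fits the class, which is the sense in which the abstract conditions its time complexity on the oracle being efficient.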
Taxonomy
Topics: Data Stream Mining Techniques · Neural Networks and Applications
