Learning Efficiently Function Approximation for Contextual MDP

Orin Levy; Yishay Mansour

arXiv:2203.00995·cs.LG·December 1, 2022·1 cites

Open Access

TL;DR

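This paper studies learning contextual MDPs with function approximation for both the rewards and the dynamics. It derives polynomial sample and time complexity bounds (assuming an efficient ERM oracle) and gives a general reduction from learning contextual MDPs to supervised learning.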

Contribution

It introduces a unified approach to learning contextual MDPs that covers both context-dependent and context-independent dynamics, with theoretical guarantees in each case.

Findings

01. Polynomial sample and time complexity for both models, assuming an efficient ERM oracle

02. A general reduction from learning contextual MDPs to supervised learning

03. Applicable to models with context-dependent and context-independent dynamics

Abstract

We study learning contextual MDPs using function approximation for both the rewards and the dynamics. We consider both the case where the dynamics depend on the context and the case where they are independent of it. For both models we derive polynomial sample and time complexity (assuming an efficient ERM oracle). Our methodology gives a general reduction from learning contextual MDPs to supervised learning.
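
To make the "ERM oracle" assumption concrete, here is a minimal sketch of empirical risk minimization over a small finite function class, fitting a context-dependent reward function from samples. This is an illustration only and not the paper's algorithm: the squared loss, the finite class, the sample layout `(context, state, action, reward)`, and the names `erm_oracle` and `make_candidate` are all assumptions made for this example.

```python
from typing import Callable, Sequence, Tuple

# One sample: (context, state, action, observed_reward). Hypothetical layout.
Sample = Tuple[float, int, int, float]
RewardFn = Callable[[float, int, int], float]


def erm_oracle(function_class: Sequence[RewardFn], data: Sequence[Sample]) -> RewardFn:
    """Return the function in a finite class with the smallest empirical squared loss.

    Assumes `data` is non-empty.
    """
    def empirical_loss(f: RewardFn) -> float:
        return sum((f(c, s, a) - r) ** 2 for c, s, a, r in data) / len(data)

    return min(function_class, key=empirical_loss)


def make_candidate(w: float) -> RewardFn:
    # Toy hypothesis: reward is w * context when action == 1, and 0 otherwise.
    return lambda c, s, a: w * c * (a == 1)


candidates = [make_candidate(w) for w in (0.0, 0.5, 1.0)]

# Noise-free toy data generated by the slope-0.5 hypothesis.
data = [(c, 0, a, 0.5 * c * (a == 1)) for c in (0.2, 0.4, 0.8) for a in (0, 1)]

best = erm_oracle(candidates, data)
```

In this toy setup the oracle selects the slope-0.5 candidate, since it is the only one with zero empirical loss. A practical oracle would replace the exhaustive search with whatever supervised learner fits the chosen function class.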

Taxonomy

Topics: Data Stream Mining Techniques · Neural Networks and Applications