Meta Reinforcement Learning with Finite Training Tasks -- a Density Estimation Approach
Zohar Rimon, Aviv Tamar, Gilad Adler

TL;DR
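This paper introduces a density estimation approach for meta reinforcement learning: the task distribution is estimated directly from the training tasks, and a policy is then trained on that estimate. The resulting bounds depend on the dimension of the task distribution and improve when the distribution lies on a low-dimensional manifold.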
Contribution
It proposes a novel density estimation-based method for meta RL that improves theoretical bounds and practical performance by leveraging task distribution structure.
Findings
Bounds on the number of training tasks depend on the dimension of the task distribution
Dimensionality reduction improves the bounds when the task distribution lies on a low-dimensional manifold
Regularization via kernel density estimation enhances practical performance
Abstract
In meta reinforcement learning (meta RL), an agent learns from a set of training tasks how to quickly solve a new task, drawn from the same task distribution. The optimal meta RL policy, a.k.a. the Bayes-optimal behavior, is well defined, and guarantees optimal reward in expectation, taken with respect to the task distribution. The question we explore in this work is how many training tasks are required to guarantee approximately optimal behavior with high probability. Recent work provided the first such PAC analysis for a model-free setting, where a history-dependent policy was learned from the training tasks. In this work, we propose a different approach: directly learn the task distribution, using density estimation techniques, and then train a policy on the learned task distribution. We show that our approach leads to bounds that depend on the dimension of the task distribution. In…
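The density estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each training task is summarized by a real-valued parameter vector (a hypothetical representation), uses a Gaussian kernel, and takes a hand-picked bandwidth; the names `kde_density` and `sample_tasks` are illustrative only.

```python
import numpy as np


def kde_density(query, train_tasks, bandwidth):
    """Gaussian kernel density estimate of the task density at `query`.

    query:       array of shape (m, d) or (d,), points to evaluate
    train_tasks: array of shape (n, d), one parameter vector per training task
    bandwidth:   kernel width h > 0 (assumed chosen by the user)
    """
    query = np.atleast_2d(query)
    n, d = train_tasks.shape
    # Pairwise differences between query points and training tasks: (m, n, d)
    diffs = query[:, None, :] - train_tasks[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    # Normalizing constant of an isotropic d-dimensional Gaussian kernel
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    kernels = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm
    return kernels.sum(axis=1) / n


def sample_tasks(train_tasks, bandwidth, num_samples, rng):
    """Sample from the Gaussian KDE: pick a training task, add Gaussian noise."""
    n, d = train_tasks.shape
    idx = rng.integers(0, n, size=num_samples)
    noise = rng.normal(0.0, bandwidth, size=(num_samples, d))
    return train_tasks[idx] + noise


rng = np.random.default_rng(0)
# Hypothetical training set: 20 tasks, each described by 2 parameters
train_tasks = rng.uniform(-1.0, 1.0, size=(20, 2))
# New tasks drawn from the estimated distribution; a policy would be trained on these
new_tasks = sample_tasks(train_tasks, bandwidth=0.1, num_samples=5, rng=rng)
```

Sampling from the estimate rather than replaying the training tasks spreads probability mass around each observed task, which is one way to picture the smoothing that a kernel density estimate introduces.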
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
