Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models
Qi Wang, Herke van Hoof

TL;DR
This paper introduces a graph-structured surrogate model for model-based meta reinforcement learning, improving dynamics prediction and enabling fast, high-return decision-making across tasks.
Contribution
It proposes a graph structured surrogate model (GSSM) that improves task dynamics modeling, and combines it with a Thompson-sampling based approach and an amortized policy optimization step for efficient meta RL.
Findings
GSSM outperforms state-of-the-art methods in predicting environment dynamics.
The approach achieves high returns without test-time policy gradient optimization.
It enables fast deployment with improved generalization across tasks.
Abstract
Reinforcement learning is a promising paradigm for solving sequential decision-making problems, but low data efficiency and weak generalization across tasks are bottlenecks in real-world applications. Model-based meta reinforcement learning addresses these issues by learning dynamics and leveraging knowledge from prior experience. In this paper, we take a closer look at this framework, and propose a new Thompson-sampling based approach that consists of a new model to identify task dynamics together with an amortized policy optimization step. We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics. Additionally, our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
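The Thompson-sampling idea mentioned in the abstract can be illustrated with a small toy sketch. This is not the GSSM or the authors' algorithm. It assumes a hypothetical 1-D task with dynamics `s_next = s + theta * a + noise`, where the unknown task parameter `theta` has a Gaussian prior, and it uses a hand-written function, `amortized_policy`, as a stand-in for a learned policy conditioned on the sampled task variable. All names and constants here are illustrative assumptions. The point is only the loop structure: sample a dynamics hypothesis from the posterior, act with a policy that needs no gradient steps at test time, then update the posterior from the observed transition.

```python
import random


def posterior_update(mean, var, s, a, s_next, noise_var=0.01):
    """Conjugate Gaussian update for theta in s_next = s + theta * a + noise."""
    if a == 0.0:
        return mean, var  # a zero action carries no information about theta
    precision = 1.0 / var + a * a / noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + a * (s_next - s) / noise_var)
    return new_mean, new_var


def amortized_policy(s, theta, goal=1.0, a_max=1.0):
    """Hypothetical stand-in for a learned policy conditioned on a task sample.

    It maps (state, sampled theta) directly to an action, so no policy
    gradient optimization is run when a new task is encountered.
    """
    if abs(theta) < 1e-6:
        return 0.0
    a = (goal - s) / theta
    return max(-a_max, min(a_max, a))


def run_episode(true_theta, steps=20, seed=0):
    rng = random.Random(seed)
    mean, var = 0.0, 1.0  # assumed prior over the task parameter
    s = 0.0
    for _ in range(steps):
        # Thompson step: draw one dynamics hypothesis from the posterior.
        theta_sample = rng.gauss(mean, var ** 0.5)
        a = amortized_policy(s, theta_sample)
        # Environment transition (noise std 0.1 matches noise_var=0.01).
        s_next = s + true_theta * a + rng.gauss(0.0, 0.1)
        mean, var = posterior_update(mean, var, s, a, s_next)
        s = s_next
    return s, mean, var


if __name__ == "__main__":
    final_state, post_mean, post_var = run_episode(true_theta=0.5)
    print(final_state, post_mean, post_var)
```

In this sketch the posterior over `theta` plays the role that a learned task-dynamics model would play, and the closed-form policy plays the role of the amortized policy; in the paper both are learned components.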
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Evolutionary Algorithms and Applications
