A model-based approach to meta-Reinforcement Learning: Transformers and tree search

Brieuc Pinon; Jean-Charles Delvenne; Raphaël Jungers

arXiv:2208.11535 · cs.LG · August 25, 2022 · 1 cites

Open Access

TL;DR

This paper introduces a model-based meta-RL approach that combines a Transformer-based dynamics model with tree search, and reports that it explores and exploits better on the symbolic Alchemy benchmark than previously applied model-free methods.

Contribution

The paper develops a model-based meta-RL algorithm in two parts: a model whose principal block is a Transformer Encoder is trained to fit the symbolic Alchemy environment dynamics, and an online planner then uses that learned model through tree search.

Findings

01 Outperforms previously applied model-free RL methods on the Alchemy benchmark

02 Shows that a Transformer can effectively model complex latent-space dynamics

03 Highlights the importance of model-based planning in meta-RL

Abstract

Meta-learning is a line of research that develops the ability to leverage past experiences to efficiently solve new learning problems. Meta-Reinforcement Learning (meta-RL) methods demonstrate a capability to learn behaviors that efficiently acquire and exploit information in several meta-RL problems. In this context, the Alchemy benchmark has been proposed by Wang et al. [2021]. Alchemy features a rich structured latent space that is challenging for state-of-the-art model-free RL methods. These methods fail to learn to properly explore then exploit. We develop a model-based algorithm. We train a model whose principal block is a Transformer Encoder to fit the symbolic Alchemy environment dynamics. Then we define an online planner with the learned model using a tree search method. This algorithm significantly outperforms previously applied model-free RL methods on the symbolic…
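The planning step described in the abstract can be illustrated with a minimal sketch. The abstract only says "a tree search method", so the depth-limited expectimax search below is an assumed stand-in and not the authors' planner. The names `plan` and `toy_model` are hypothetical, and the hand-written `toy_model` takes the place of the learned Transformer model: it returns, for a state and an action, a list of (probability, next state, reward) outcomes.

```python
from typing import Callable, List, Optional, Tuple

# Assumed interface for a learned dynamics model (hypothetical, not from the paper):
# model(state, action) -> list of (probability, next_state, reward).
Outcome = Tuple[float, int, float]
Model = Callable[[int, int], List[Outcome]]


def plan(model: Model, state: int, actions: List[int],
         depth: int) -> Tuple[float, Optional[int]]:
    """Depth-limited expectimax: return (expected return, best first action)."""
    if depth == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action in actions:
        # Expected return of taking `action` now, then acting optimally.
        q = 0.0
        for prob, next_state, reward in model(state, action):
            future, _ = plan(model, next_state, actions, depth - 1)
            q += prob * (reward + future)
        if q > best_value:
            best_value, best_action = q, action
    return best_value, best_action


def toy_model(state: int, action: int) -> List[Outcome]:
    """Hand-written stand-in for a learned model (illustration only)."""
    if state == 0:
        if action == 0:
            # Safe choice: small immediate reward, nothing learned.
            return [(1.0, 0, 1.0)]
        # Exploratory choice: no reward now, 50% chance to reach state 1.
        return [(0.5, 1, 0.0), (0.5, 0, 0.0)]
    # State 1: action 0 pays well, action 1 pays nothing.
    return [(1.0, 1, 5.0 if action == 0 else 0.0)]


if __name__ == "__main__":
    print(plan(toy_model, 0, [0, 1], depth=1))  # short horizon: greedy action
    print(plan(toy_model, 0, [0, 1], depth=3))  # longer horizon: exploratory action
```

With a one-step horizon the search picks the immediately rewarding action, while a three-step horizon prefers the action that pays nothing now but may reach the more rewarding state. This is the explore-then-exploit pattern the abstract says model-free methods struggle to learn, shown here only on a toy problem.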

Taxonomy

Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Dropout · Label Smoothing · Softmax