SplAgger: Split Aggregation for Meta-Reinforcement Learning

Jacob Beck; Matthew Jackson; Risto Vuorio; Zheng Xiong; Shimon Whiteson

arXiv:2403.03020·cs.LG·June 4, 2024·1 cites

TL;DR

SplAgger is a meta-reinforcement learning method that combines permutation-invariant and permutation-variant sequence models so that agents learn new tasks rapidly. It outperforms the baselines in continuous control and memory environments.

Contribution

The paper proposes SplAgger, a meta-RL method that integrates permutation-invariant and permutation-variant sequence models, and shows that it improves on the baselines.
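To make the distinction between permutation-invariant and permutation-variant aggregation concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' architecture: the toy encoder, the element-wise max as the invariant aggregator, the simple recurrent update as the variant aggregator, and every name in the block (`encode`, `invariant_aggregate`, `variant_aggregate`, `split_aggregate`, `split`, `decay`) are hypothetical choices made for this example.

```python
import math
import random

def encode(transition, weights):
    """Map one transition (a list of floats) to features via tanh(W x).
    Hypothetical toy encoder, used only for illustration."""
    return [math.tanh(sum(w * x for w, x in zip(row, transition)))
            for row in weights]

def invariant_aggregate(features):
    """Element-wise max over the sequence: the order of transitions is ignored."""
    return [max(column) for column in zip(*features)]

def variant_aggregate(features, decay=0.5):
    """Toy recurrent update h <- tanh(decay * h + f): the order of transitions matters."""
    h = [0.0] * len(features[0])
    for f in features:
        h = [math.tanh(decay * hi + fi) for hi, fi in zip(h, f)]
    return h

def split_aggregate(transitions, weights, split):
    """Encode each transition, route the first `split` features through the
    invariant path and the rest through the variant path, then concatenate."""
    feats = [encode(t, weights) for t in transitions]
    inv = invariant_aggregate([f[:split] for f in feats])
    var = variant_aggregate([f[split:] for f in feats])
    return inv + var

# Assumed toy data: 5 transitions of 3 numbers each, encoded to 4 features.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
transitions = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
reordered = list(reversed(transitions))

out = split_aggregate(transitions, weights, split=2)
out_reordered = split_aggregate(reordered, weights, split=2)
```

In this sketch the first two entries of `out` and `out_reordered` agree because the max ignores ordering, while the recurrent half is free to depend on the order in which transitions arrive. How the paper itself splits and combines the two paths may differ from this toy version.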

Findings

01 Permutation invariant sequence models are beneficial even without task inference objectives.

02 SplAgger outperforms all baselines on continuous control and memory tasks.

03 Multiple conditions under which permutation variance remains useful are identified.

Abstract

A core ambition of reinforcement learning (RL) is the creation of agents capable of rapid learning in novel tasks. Meta-RL aims to achieve this by directly learning such agents. Black box methods do so by training off-the-shelf sequence models end-to-end. By contrast, task inference methods explicitly infer a posterior distribution over the unknown task, typically using distinct objectives and sequence models designed to enable task inference. Recent work has shown that task inference methods are not necessary for strong performance. However, it remains unclear whether task inference sequence models are beneficial even when task inference objectives are not. In this paper, we present evidence that task inference sequence models are indeed still beneficial. In particular, we investigate sequence models with permutation invariant aggregation, which exploit the fact that, due to the Markov…

Code & Models

Repositories

jacooba/hyper (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Modular Robots and Swarm Intelligence