Meta-Reinforcement Learning via Exploratory Task Clustering
Zhendong Chu, Hongning Wang

TL;DR
This paper introduces a meta-reinforcement learning method that uses task clustering to exploit structured heterogeneity among tasks, improving sample efficiency and adaptation speed.
Contribution
It proposes a novel exploratory policy for task clustering in meta-RL, enabling better knowledge sharing and more efficient policy adaptation.
Findings
Effectively uncovers task clusters in rewards and dynamics
Achieves superior sample efficiency over baselines
Demonstrates robustness across MuJoCo tasks
Abstract
Meta-reinforcement learning (meta-RL) aims to quickly solve new tasks by leveraging knowledge from prior tasks. However, previous studies often assume a single-mode, homogeneous task distribution, ignoring possible structured heterogeneity among tasks. Leveraging such structures can better facilitate knowledge sharing among related tasks and thus improve sample efficiency. In this paper, we explore the structured heterogeneity among tasks via clustering to improve meta-RL. We develop a dedicated exploratory policy to discover task structures via divide-and-conquer. The knowledge of the identified clusters helps to narrow the search space of task-specific information, leading to more sample-efficient policy adaptation. Experiments on various MuJoCo tasks show that the proposed method can unravel cluster structures effectively in both rewards and state dynamics, proving strong advantages…
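The abstract's claim that knowing a task's cluster narrows the search space of task-specific information can be illustrated with a toy sketch. This is not the authors' algorithm: the tasks here are hypothetical 2-D goal positions drawn from three modes, the clusters are found with plain k-means rather than the paper's exploratory policy, and every name (`sample_goals`, `kmeans`, `global_radius`, `cluster_radius`) is an assumption made for illustration. It only compares how far a task typically lies from a single global prior versus from its own cluster's centre.

```python
import numpy as np

def sample_goals(rng, centers, n_per_cluster, spread):
    """Toy task distribution: 2-D goal positions, one Gaussian mode per cluster."""
    return np.concatenate(
        [np.asarray(c) + spread * rng.standard_normal((n_per_cluster, 2)) for c in centers]
    )

def kmeans(points, k, n_iters=20):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    seeds = [points[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen seed.
        nearest = np.min([np.linalg.norm(points - s, axis=1) for s in seeds], axis=0)
        seeds.append(points[np.argmax(nearest)])
    centroids = np.array(seeds)
    for _ in range(n_iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(0)
goals = sample_goals(
    rng, centers=[(-5.0, 0.0), (5.0, 0.0), (0.0, 5.0)], n_per_cluster=30, spread=0.5
)
centroids, labels = kmeans(goals, k=3)

# Average distance from a task to the prior used as the starting point for adaptation.
global_radius = np.linalg.norm(goals - goals.mean(axis=0), axis=1).mean()
cluster_radius = np.linalg.norm(goals - centroids[labels], axis=1).mean()
print(f"single global prior: {global_radius:.2f}, cluster-level prior: {cluster_radius:.2f}")
```

In this toy setting the cluster-level prior sits much closer to each task than the single global prior, which is the intuition behind a narrower search space; the paper's actual setting involves reward and dynamics structure in MuJoCo tasks rather than fixed goal coordinates.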
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics
