MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL

Fei Ni; Jianye Hao; Yao Mu; Yifu Yuan; Yan Zheng; Bin Wang; Zhixuan Liang

arXiv:2305.19923·cs.LG·June 1, 2023·2 cites

TL;DR

MetaDiffuser introduces a task-conditioned diffusion model for offline meta-reinforcement learning, enabling better generalization across tasks with diverse dynamics and rewards, and improving trajectory planning robustness.

Contribution

It proposes a novel task-conditioned diffusion framework with a dual-guided sampling module for offline meta-RL, enhancing generalization and trajectory quality.

Findings

01. Outperforms strong offline meta-RL baselines on MuJoCo benchmarks.
02. Demonstrates robust generalization across diverse tasks.
03. Shows improved trajectory planning and return optimization.

Abstract

Recently, diffusion models have emerged as a promising backbone for the sequence modeling paradigm in offline reinforcement learning (RL). However, these works mostly lack the ability to generalize across tasks whose rewards or dynamics change. To tackle this challenge, we propose a task-oriented conditioned diffusion planner for offline meta-RL (MetaDiffuser), which treats the generalization problem as a conditional trajectory generation task with contextual representation. The key is to learn a context-conditioned diffusion model that can generate task-oriented trajectories for planning across diverse tasks. To enhance the dynamics consistency of the generated trajectories while encouraging them to achieve high returns, we further design a dual-guided module in the sampling process of the diffusion model. The proposed framework enjoys the robustness to the quality of…
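To make the dual-guided sampling idea more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The learned context-conditioned denoiser is replaced by a hypothetical linear stand-in (`toy_denoiser_mean`), the return signal by a toy quadratic objective, and the dynamics model by an assumed linear map `A`. All function names, guidance weights, and the noise schedule are illustrative assumptions. At each reverse step, the sketch nudges the denoised trajectory up the return gradient and down the dynamics-discrepancy gradient.

```python
import numpy as np

def toy_denoiser_mean(traj, context, step_frac):
    """Hypothetical stand-in for a learned context-conditioned denoiser:
    pulls the noisy trajectory toward a context-dependent anchor."""
    return traj + step_frac * (context - traj)

def return_grad(traj, goal):
    """Gradient of a toy return J = -sum_t ||s_t - goal||^2 w.r.t. traj."""
    return -2.0 * (traj - goal)

def dynamics_discrepancy(traj, A):
    """Toy discrepancy D = sum_t ||s_{t+1} - A s_t||^2 (assumed linear dynamics)."""
    resid = traj[1:] - traj[:-1] @ A.T
    return float(np.sum(resid ** 2))

def dynamics_grad(traj, A):
    """Analytic gradient of dynamics_discrepancy w.r.t. traj."""
    resid = traj[1:] - traj[:-1] @ A.T      # shape (H-1, d)
    grad = np.zeros_like(traj)
    grad[1:] += 2.0 * resid                 # dD/ds_{t+1} = 2 r_t
    grad[:-1] += -2.0 * resid @ A           # dD/ds_t = -2 A^T r_t
    return grad

def dual_guided_sample(context, goal, A, horizon=8, n_steps=50,
                       w_return=0.05, w_dynamics=0.05,
                       noise_scale=0.1, seed=0):
    """Reverse-process loop with two guidance terms (illustrative only)."""
    rng = np.random.default_rng(seed)
    traj = rng.standard_normal((horizon, context.shape[0]))  # start from noise
    for k in range(n_steps, 0, -1):
        sigma = k / n_steps                 # toy schedule, shrinks toward 0
        mean = toy_denoiser_mean(traj, context, step_frac=0.1)
        # Dual guidance: raise the toy return, lower the dynamics discrepancy.
        mean = (mean
                + w_return * return_grad(mean, goal)
                - w_dynamics * dynamics_grad(mean, A))
        noise = rng.standard_normal(traj.shape) if k > 1 else np.zeros_like(traj)
        traj = mean + noise_scale * sigma * noise
    return traj

if __name__ == "__main__":
    ctx = np.ones(2)                        # hypothetical task context vector
    plan = dual_guided_sample(ctx, goal=ctx, A=np.eye(2))
    print(plan.shape, dynamics_discrepancy(plan, np.eye(2)))
```

In this toy setting, setting `w_dynamics=0.0` disables the consistency term, which makes it easy to compare trajectories sampled with and without that guidance.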

Videos

MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Evolutionary Algorithms and Applications

Methods: Diffusion