Goal-Conditioned Imitation Learning using Score-based Diffusion Policies

Moritz Reuss; Maximilian Li; Xiaogang Jia; Rudolf Lioutikov

arXiv:2304.02532 · cs.LG · June 2, 2023 · 5 cites

TL;DR

This paper introduces BESO, a goal-conditioned imitation learning policy built on score-based diffusion models. It generates expressive, multi-modal, goal-specific behavior in as few as 3 denoising steps and learns from large uncurated datasets without rewards.

Contribution

The paper presents BESO, a new diffusion-based policy architecture for goal-conditioned imitation learning that decouples score learning from inference, enabling fast sampling and learning both goal-dependent and goal-independent policies.

Findings

1. BESO achieves goal-specific behavior generation in just 3 denoising steps.
2. BESO outperforms state-of-the-art goal-conditioned imitation learning methods.
3. The method effectively captures multi-modality in behavior data.

Abstract

We propose a new policy representation based on score-based diffusion models (SDMs). We apply our new policy representation in the domain of Goal-Conditioned Imitation Learning (GCIL) to learn general-purpose goal-specified policies from large uncurated datasets without rewards. Our new goal-conditioned policy architecture "BEhavior generation with ScOre-based Diffusion Policies" (BESO) leverages a generative, score-based diffusion model as its policy. BESO decouples the learning of the score model from the inference sampling process and hence allows for fast sampling strategies that generate goal-specified behavior in just 3 denoising steps, compared to the 30+ steps of other diffusion-based policies. Furthermore, BESO is highly expressive and can effectively capture the multi-modality present in the solution space of the play data. Unlike previous methods such…
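
To make the idea of decoupling the score model from the sampler concrete, here is a minimal sketch of a 3-step deterministic sampler. It is an illustration under stated assumptions, not the authors' implementation: the noise schedule (a Karras-style power schedule), the Euler update, and every name below (`karras_sigmas`, `toy_denoiser`, `sample_actions`) are hypothetical, and the learned network is replaced by a closed-form denoiser for a toy Gaussian action distribution centered on the goal.

```python
import numpy as np

def karras_sigmas(n_steps, sigma_min=0.01, sigma_max=1.0, rho=7.0):
    """Assumed noise schedule: n_steps decreasing noise levels, then 0."""
    ramp = np.linspace(0.0, 1.0, n_steps)
    min_inv, max_inv = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    sigmas = (max_inv + ramp * (min_inv - max_inv)) ** rho
    return np.append(sigmas, 0.0)

def toy_denoiser(x, sigma, goal, data_std=0.05):
    """Stand-in for a trained goal-conditioned denoiser (hypothetical).

    For actions distributed as N(goal, data_std^2), the posterior mean of
    the clean action given the noisy action x is known in closed form.
    """
    shrink = data_std**2 / (data_std**2 + sigma**2)
    return goal + shrink * (x - goal)

def sample_actions(goal, n_steps=3, n_samples=16, seed=0):
    """Euler integration of the probability-flow ODE dx/dsigma = (x - D) / sigma.

    The sampler only calls the denoiser, so the step count and schedule can
    be changed at inference time without retraining.
    """
    rng = np.random.default_rng(seed)
    sigmas = karras_sigmas(n_steps)
    # Start from pure noise at the largest noise level.
    x = sigmas[0] * rng.standard_normal((n_samples, goal.shape[0]))
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - toy_denoiser(x, sigma, goal)) / sigma
        x = x + (sigma_next - sigma) * d
    return x

goal = np.array([0.5, -0.3])
actions = sample_actions(goal)  # 16 two-dimensional actions near the goal
```

Because `sample_actions` treats the denoiser as a black box, swapping in a different solver or a different `n_steps` requires no change to the model, which is the property the abstract attributes to separating score learning from inference.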

Code & Models

Repositories

intuitive-robots/beso (PyTorch)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Diffusion