Goal-Conditioned Imitation Learning using Score-based Diffusion Policies
Moritz Reuss, Maximilian Li, Xiaogang Jia, Rudolf Lioutikov

TL;DR
This paper introduces BESO, a novel goal-conditioned imitation learning policy using score-based diffusion models that enables fast, expressive, and multi-modal goal-specific behavior generation from large uncurated datasets.
Contribution
The paper presents BESO, a new diffusion-based policy architecture for goal-conditioned imitation learning that decouples score learning from inference, enabling fast sampling and learning both goal-dependent and goal-independent policies.
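One common way to get both a goal-dependent and a goal-independent policy out of a single network is to hide the goal at random during denoising training. The summary above does not say that BESO does exactly this, so the following is only a minimal sketch under that assumption. The helper names, the log-uniform noise-level distribution, the 1-D actions, and the `p_drop` value are all hypothetical choices made for illustration.

```python
import numpy as np

def make_training_batch(actions, goals, rng, p_drop=0.1,
                        sigma_min=0.01, sigma_max=10.0):
    """Build one denoising batch with random goal masking (hypothetical helper)."""
    n = actions.shape[0]
    # Log-uniform noise levels; this distribution is an assumption of the sketch.
    sigmas = np.exp(rng.uniform(np.log(sigma_min), np.log(sigma_max), size=n))
    # Perturb the clean actions with Gaussian noise of scale sigma.
    noisy_actions = actions + sigmas * rng.standard_normal(n)
    # keep[i] == 0 hides the goal for sample i (the goal-independent case).
    keep = (rng.random(n) >= p_drop).astype(actions.dtype)
    masked_goals = goals * keep
    return noisy_actions, sigmas, masked_goals, keep

def denoising_loss(denoiser, actions, noisy_actions, sigmas, masked_goals, keep):
    """Mean squared error between the denoiser output and the clean actions."""
    pred = denoiser(noisy_actions, sigmas, masked_goals, keep)
    return float(np.mean((pred - actions) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    actions = rng.standard_normal(256)   # toy 1-D "actions"
    goals = np.ones(256)                 # toy 1-D "goals"
    batch = make_training_batch(actions, goals, rng)
    # Placeholder denoiser that returns its noisy input unchanged.
    identity = lambda x, s, g, k: x
    print(denoising_loss(identity, actions, *batch))
```

Passing the `keep` flag alongside the masked goal lets a network tell a hidden goal apart from a goal whose value happens to be zero.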
Findings
BESO generates goal-specific behavior in just 3 denoising steps, compared with 30+ steps for other diffusion-based policies.
BESO outperforms state-of-the-art goal-conditioned imitation learning methods.
The method effectively captures multi-modality in behavior data.
Abstract
We propose a new policy representation based on score-based diffusion models (SDMs). We apply our new policy representation in the domain of Goal-Conditioned Imitation Learning (GCIL) to learn general-purpose goal-specified policies from large uncurated datasets without rewards. Our new goal-conditioned policy architecture "BEhavior generation with ScOre-based Diffusion Policies" (BESO) leverages a generative, score-based diffusion model as its policy. BESO decouples the learning of the score model from the inference sampling process and hence allows for fast sampling strategies to generate goal-specified behavior in just 3 denoising steps, compared to 30+ steps of other diffusion-based policies. Furthermore, BESO is highly expressive and can effectively capture multi-modality present in the solution space of the play data. Unlike previous methods such…
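The decoupling described in the abstract means that the number of denoising steps is chosen at inference time and is independent of how the score model was trained. The sketch below illustrates this with a closed-form toy denoiser standing in for a trained network. The Euler integrator, the noise-level schedule, the 1-D action space, and all function names and constants are assumptions made for illustration, not necessarily the sampler used in the paper.

```python
import numpy as np

def toy_denoiser(x, sigma, goal, data_std=0.1):
    """Closed-form optimal denoiser when clean actions ~ N(goal, data_std^2).

    Stand-in for a trained goal-conditioned network D(x, sigma | goal).
    """
    var = data_std ** 2
    return (var * x + sigma ** 2 * goal) / (var + sigma ** 2)

def karras_sigmas(n_steps, sigma_min=0.01, sigma_max=10.0, rho=7.0):
    """Decreasing noise levels from sigma_max to sigma_min, followed by 0."""
    ramp = np.linspace(0.0, 1.0, n_steps)
    max_r, min_r = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    sigmas = (max_r + ramp * (min_r - max_r)) ** rho
    return np.append(sigmas, 0.0)

def sample_actions(denoiser, goal, n_steps, n_samples=1000, seed=0):
    """Deterministic Euler integration from pure noise down to sigma = 0."""
    rng = np.random.default_rng(seed)
    sigmas = karras_sigmas(n_steps)
    # Start from Gaussian noise at the largest noise level.
    x = rng.standard_normal(n_samples) * sigmas[0]
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        # Slope dx/dsigma implied by the denoiser at the current noise level.
        d = (x - denoiser(x, s_cur, goal)) / s_cur
        x = x + (s_next - s_cur) * d
    return x

if __name__ == "__main__":
    # The same denoiser is reused with different step counts.
    for n in (3, 30):
        a = sample_actions(toy_denoiser, goal=1.0, n_steps=n)
        print(n, a.mean(), a.std())
```

In this toy setting the samples concentrate near the goal for either step count. Few-step Euler integration is coarse, so the spread of the samples need not match the spread of the data.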
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Diffusion
