Diffusion Models as Optimizers for Efficient Planning in Offline RL

Renming Huang; Yunqiang Pei; Guoqing Wang; Yangming Zhang; Yang Yang; Peng Wang; Hengtao Shen

arXiv:2407.16142·cs.LG·July 24, 2024

TL;DR

This paper introduces the Trajectory Diffuser, a method that decomposes diffusion model sampling into trajectory generation and optimization, significantly improving planning efficiency in offline reinforcement learning without sacrificing performance.

Contribution

It proposes a novel decomposition approach and a hybrid model combining autoregressive and diffusion techniques for faster, high-quality offline RL planning.

Findings

1. Achieves 3-10x faster inference than previous methods.
2. Outperforms existing sequence modeling approaches in overall performance.
3. Demonstrates effectiveness on D4RL benchmarks.

Abstract

Diffusion models have shown strong competitiveness in offline reinforcement learning tasks by formulating decision-making as sequential generation. However, the practicality of these methods is limited due to the lengthy inference processes they require. In this paper, we address this problem by decomposing the sampling process of diffusion models into two decoupled subprocesses: 1) generating a feasible trajectory, which is a time-consuming process, and 2) optimizing the trajectory. With this decomposition approach, we are able to partially separate efficiency and quality factors, enabling us to simultaneously gain efficiency advantages and ensure quality assurance. We propose the Trajectory Diffuser, which utilizes a faster autoregressive model to handle the generation of feasible trajectories while retaining the trajectory optimization process of diffusion models. This allows us to…
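The two-stage split described in the abstract can be pictured with a toy sketch: a cheap autoregressive rollout produces a draft trajectory, and a small number of iterative refinement steps then improve it. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the 1-D state, the `GOAL` constant, the hand-written `draft_step` rule, and the quadratic `cost`. In particular, the refinement here is plain gradient descent on that cost, standing in for the learned diffusion denoiser that the Trajectory Diffuser retains.

```python
import random

GOAL = 1.0  # hypothetical target state for this toy problem


def draft_step(state):
    """Hypothetical autoregressive model: predict the next state from the last one."""
    return state + 0.3 * (GOAL - state)


def autoregressive_rollout(start, horizon):
    """Stage 1: build a feasible draft trajectory one state at a time."""
    traj = [start]
    while len(traj) < horizon:
        traj.append(draft_step(traj[-1]))
    return traj


def cost(traj):
    """Toy objective: smooth transitions plus reaching GOAL at the final step."""
    smooth = sum((b - a) ** 2 for a, b in zip(traj, traj[1:]))
    return smooth + (traj[-1] - GOAL) ** 2


def refine_step(traj, lr=0.05):
    """One gradient-descent step on cost(); index 0 (the current state) stays fixed.

    This stands in for a single learned denoising step.
    """
    new = list(traj)
    last = len(traj) - 1
    for i in range(1, len(traj)):
        grad = 2 * (traj[i] - traj[i - 1])
        if i < last:
            grad -= 2 * (traj[i + 1] - traj[i])
        else:
            grad += 2 * (traj[i] - GOAL)
        new[i] = traj[i] - lr * grad
    return new


def plan(start, horizon=8, refine_steps=5, noise_scale=0.0, seed=0):
    """Stage 1 then stage 2: draft, optionally perturb, then refine a few times."""
    rng = random.Random(seed)
    draft = autoregressive_rollout(start, horizon)
    # Optional small perturbation of the draft, loosely mimicking a start
    # from a lightly noised sample rather than from pure noise.
    traj = [draft[0]] + [s + rng.gauss(0.0, noise_scale) for s in draft[1:]]
    for _ in range(refine_steps):
        traj = refine_step(traj)
    return draft, traj
```

Calling `plan(0.0)` returns both the draft and the refined trajectory, so `cost(draft)` and `cost(refined)` can be compared directly. The point of the sketch is only the division of labour: the expensive search for a feasible trajectory is replaced by a single sequential rollout, and the iterative procedure is run for a handful of steps instead of a full sampling chain.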

Code & Models

Repositories

renming-huang/trajectorydiffuser
PyTorch · Official

Taxonomy

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings