DyDiff: Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning

Hanye Zhao; Xiaoshen Han; Zhengbang Zhu; Minghuan Liu; Yong Yu; De-Chuan Zhan; Weinan Zhang

arXiv:2405.19189 · cs.LG · May 19, 2026

TL;DR

DyDiff is a diffusion-based method for long-horizon trajectory rollout in offline reinforcement learning. It iteratively injects information from the learning policy into a diffusion dynamics model to improve rollout accuracy and policy consistency.

Contribution

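The paper proposes DyDiff, a diffusion-based approach that uses the diffusion model as a dynamics model, decoupled from the behavior policy implicit in the dataset, so that the learning policy itself can roll out accurate long-horizon trajectories in offline RL.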

Findings

01. DyDiff achieves superior long-horizon rollout accuracy.

02. It maintains policy consistency during rollouts.

03. Theoretical analysis shows advantages over traditional models.

Abstract

With the great success of diffusion models (DMs) in generating realistic synthetic vision data, many researchers have investigated their potential in decision-making and control. Most of these works use DMs to sample directly from the trajectory space, where a DM can be viewed as a combination of a dynamics model and a policy. In this work, we explore how to isolate DMs' ability as dynamics models in fully offline settings, allowing the learning policy to roll out trajectories. Because DMs learn the data distribution of the dataset, their intrinsic policy is the behavior policy induced by the dataset, which results in a mismatch between the behavior policy and the learning policy. We propose Dynamics Diffusion, DyDiff for short, which iteratively injects information from the learning policy into DMs. DyDiff ensures long-horizon rollout accuracy while maintaining policy…
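
As a rough illustration of the iterative injection idea described in the abstract, the toy Python sketch below alternates between generating a state sequence conditioned on an action sequence and relabeling those actions with the learning policy. Everything in it is an assumption made for illustration: `toy_state_generator` is a deterministic stand-in for an action-conditioned diffusion model, the names `learning_policy`, `iterative_rollout`, and `policy_gap` are hypothetical, and the alternating scheme is only one plausible reading of "iteratively", not the authors' algorithm.

```python
from typing import Callable, List, Tuple

Policy = Callable[[float], float]
StateGenerator = Callable[[float, List[float]], List[float]]


def toy_state_generator(s0: float, actions: List[float]) -> List[float]:
    """Hypothetical stand-in for an action-conditioned diffusion model.

    Returns states s_0..s_T for the toy dynamics s_{t+1} = s_t + a_t.
    """
    states = [s0]
    for a in actions:
        states.append(states[-1] + a)
    return states


def learning_policy(state: float) -> float:
    """Hypothetical learning policy: push the state toward zero."""
    return -0.5 * state


def iterative_rollout(
    s0: float,
    policy: Policy,
    generator: StateGenerator,
    horizon: int,
    n_iters: int,
) -> Tuple[List[float], List[float]]:
    """Alternate state generation and action relabeling (assumed scheme)."""
    # Crude initial guess: repeat the policy's action at the initial state.
    actions = [policy(s0)] * horizon
    for _ in range(n_iters):
        # Generate states conditioned on the current action sequence.
        states = generator(s0, actions)
        # Inject the learning policy by relabeling actions on those states.
        actions = [policy(s) for s in states[:-1]]
    states = generator(s0, actions)
    return states, actions


def policy_gap(states: List[float], actions: List[float], policy: Policy) -> float:
    """Largest difference between a rollout action and the policy's action."""
    return max(abs(a - policy(s)) for s, a in zip(states, actions))


if __name__ == "__main__":
    for n in (0, 2, 8):
        states, actions = iterative_rollout(
            1.0, learning_policy, toy_state_generator, horizon=8, n_iters=n
        )
        print(n, policy_gap(states, actions, learning_policy))
```

In this deterministic toy setting, each iteration makes one more step of the rollout consistent with the learning policy, so the gap reaches zero once the number of iterations reaches the horizon. A learned, stochastic generative model would offer no such exact guarantee.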

Code & Models

Repositories

fineartz/dydiff (PyTorch, official)

Taxonomy

Topics: Traffic control and management · Autonomous Vehicle Technology and Safety · Vehicle Dynamics and Control Systems

Methods: Diffusion