Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning

Jifeng Hu; Yanchao Sun; Sili Huang; SiYuan Guo; Hechang Chen; Li Shen; Lichao Sun; Yi Chang; Dacheng Tao

arXiv:2306.04875 · cs.LG · June 9, 2023 · 1 citation

TL;DR

This paper introduces TCD, a temporally-conditional diffusion model that leverages historical, immediate, and prospective sequence information to improve controllable generation in offline reinforcement learning tasks.

Contribution

The paper proposes TCD, a novel diffusion model that explicitly incorporates multiple temporal conditions derived from interaction sequences for enhanced offline RL performance.
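To make the idea of multiple temporal conditions concrete, here is a minimal sketch of how three such signals could be read off a single logged trajectory at step t. The summary above does not define the conditions, so the choices here are assumptions for illustration only: the historical condition is taken to be a short window of past states, the immediate condition the reward at step t, and the prospective condition the discounted return from step t onward. The names `TemporalConditions`, `temporal_conditions`, `history_len`, and `gamma` are hypothetical and do not come from the paper.

```python
from typing import List, NamedTuple, Sequence


class TemporalConditions(NamedTuple):
    """Hypothetical container; the field meanings are assumptions, not the paper's definitions."""
    historical: List[Sequence[float]]  # last `history_len` states up to and including step t
    immediate: float                   # reward observed at step t
    prospective: float                 # discounted return from step t to the end of the trajectory


def temporal_conditions(
    states: Sequence[Sequence[float]],
    rewards: Sequence[float],
    t: int,
    history_len: int = 4,
    gamma: float = 0.99,
) -> TemporalConditions:
    """Extract three illustrative temporal conditions at step t of one trajectory."""
    if len(states) != len(rewards):
        raise ValueError("states and rewards must have the same length")
    if not 0 <= t < len(rewards):
        raise IndexError("t is outside the trajectory")

    # Historical: a window of past states, truncated at the start of the trajectory.
    start = max(0, t - history_len + 1)
    historical = list(states[start : t + 1])

    # Immediate: the reward at the current step.
    immediate = rewards[t]

    # Prospective: discounted return-to-go, accumulated backwards from the last step.
    prospective = 0.0
    for r in reversed(rewards[t:]):
        prospective = r + gamma * prospective

    return TemporalConditions(historical, immediate, prospective)
```

In a conditional diffusion setup, a tuple like this would typically be embedded and passed to the denoising network alongside the noisy sequence; how TCD actually encodes and combines its conditions is not stated in this summary.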

Findings

1. TCD matches or exceeds state-of-the-art performance in offline RL tasks.

2. Explicit temporal conditioning improves controllability and generation quality.

3. Comprehensive analysis demonstrates the effectiveness of temporal conditions in sequential decision-making.

Abstract

Recent works have shown the potential of diffusion models in computer vision and natural language processing. Apart from the classical supervised learning fields, diffusion models have also shown strong competitiveness in reinforcement learning (RL) by formulating decision-making as sequential generation. However, incorporating temporal information of sequential data and utilizing it to guide diffusion models to perform better generation is still an open challenge. In this paper, we take one step forward to investigate controllable generation with temporal conditions that are refined from temporal information. We observe the importance of temporal conditions in sequential generation in sufficiently explorative scenarios and provide a comprehensive discussion and comparison of different temporal conditions. Based on the observations, we propose an effective temporally-conditional diffusion…

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Diffusion