DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting

Eric Hanchen Jiang; Zhi Zhang; Dinghuai Zhang; Andrew Lizarraga; Chenheng Xu; Yasi Zhang; Siyan Zhao; Zhengjie Xu; Peiyu Yu; Yuer Tang; Deqian Kong; Ying Nian Wu

arXiv:2410.11359·cs.LG·October 16, 2024

TL;DR

This paper presents DODT, a novel reinforcement learning framework that combines Dreamer's trajectory forecasting with decision transformers, leading to improved sample efficiency and robustness in complex decision-making tasks.

Contribution

The paper introduces an integrated approach that merges Dreamer's world model with decision transformers, enabling bidirectional learning and enhanced performance in reinforcement learning.

Findings

01. Achieves better sample efficiency than existing methods.

02. Demonstrates improved reward maximization across benchmarks.

03. Shows robustness in diverse, dynamic environments.

Abstract

Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strengths of the Online Decision Transformer. Our methodology enables parallel training where Dreamer-produced trajectories enhance the contextual decision-making of the transformer, creating a bidirectional enhancement loop. We empirically demonstrate the efficacy of our approach on a suite of challenging benchmarks, achieving notable improvements in sample efficiency and reward maximization over existing methods. Our results indicate that the proposed integrated framework not only accelerates…
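
The abstract describes Dreamer-produced trajectories feeding the transformer's training data. The sketch below illustrates one way that data flow could look, and it is not the authors' implementation: the `Trajectory` container, the `sample_batch` helper, and the `imagined_fraction` mixing ratio are all hypothetical names introduced here. The only part taken from standard Decision Transformer practice is the return-to-go computation used as the conditioning signal.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    """Hypothetical container for one rollout (not from the paper)."""
    states: List[float]
    actions: List[float]
    rewards: List[float]
    imagined: bool  # True if produced by a world model, False if from the real environment


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return the (optionally discounted) sum of future rewards at each step.

    Decision transformers condition each action on this quantity.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def sample_batch(
    real: List[Trajectory],
    imagined: List[Trajectory],
    batch_size: int,
    imagined_fraction: float,
    rng: random.Random,
) -> List[Trajectory]:
    """Draw a training batch mixing real and imagined trajectories.

    ASSUMPTION: a fixed mixing ratio. The abstract does not say how the
    two sources are combined. Sampling is with replacement, and `real`
    must be non-empty.
    """
    n_imagined = round(batch_size * imagined_fraction) if imagined else 0
    n_real = batch_size - n_imagined
    batch = [rng.choice(imagined) for _ in range(n_imagined)]
    batch += [rng.choice(real) for _ in range(n_real)]
    return batch


# Usage: two toy trajectories, one from each source.
real_buffer = [Trajectory([0.0, 1.0], [0.5, 0.5], [1.0, 2.0], imagined=False)]
dream_buffer = [Trajectory([0.0, 1.0], [0.1, 0.9], [0.5, 0.5], imagined=True)]
batch = sample_batch(real_buffer, dream_buffer, batch_size=4,
                     imagined_fraction=0.5, rng=random.Random(0))
targets = [returns_to_go(traj.rewards) for traj in batch]
```

A fixed ratio keeps the example short. Whether the paper uses a fixed ratio, a schedule, or some other weighting of imagined data cannot be determined from the abstract alone.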

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Dense Connections · Residual Connection · Dropout · Layer Normalization · Adam · Byte Pair Encoding · Absolute Position Encodings · Softmax · Attention Is All You Need · Linear Layer