DRDT3: Diffusion-Refined Decision Test-Time Training Model

Xingshuai Huang; Di Wu; Benoit Boulet

arXiv:2501.06718 · cs.LG · September 18, 2025

Open Access

TL;DR

DRDT3 is a framework that combines Decision Transformer-style trajectory modelling, the RNN-based Test-Time Training (TTT) layer, and diffusion models to improve trajectory modelling and policy optimization in offline reinforcement learning.

Contribution

The paper proposes the Decision TTT (DT3) module and a unified diffusion-refined framework, DRDT3, which is reported to outperform existing Decision Transformer and offline RL methods.

Findings

01. DRDT3 outperforms the standard Decision Transformer on multiple tasks.

02. DRDT3 achieves state-of-the-art results on the D4RL benchmark.

03. The diffusion refinement process improves policy quality progressively.

Abstract

Decision Transformer (DT), a trajectory modelling method, has shown competitive performance compared to traditional offline reinforcement learning (RL) approaches on various classic control tasks. However, it struggles to learn optimal policies from suboptimal, reward-labelled trajectories. In this study, we explore the use of conditional generative modelling to facilitate trajectory stitching given its high-quality data generation ability. Additionally, recent advancements in Recurrent Neural Networks (RNNs) have shown their linear complexity and competitive sequence modelling performance over Transformers. We leverage the Test-Time Training (TTT) layer, an RNN that updates hidden states during testing, to model trajectories in the form of DT. We introduce a unified framework, called Diffusion-Refined Decision TTT (DRDT3), to achieve performance beyond DT models. Specifically, we…
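The abstract describes the TTT layer as an RNN whose hidden state is updated during testing. A minimal sketch of that idea follows, under strong simplifying assumptions: the hidden state is a single weight matrix `W`, the self-supervised task is plain reconstruction of the input token, and the learned projections of a real TTT layer are replaced by the identity. The names `matvec`, `ttt_linear_step`, and `ttt_forward` are hypothetical and are not taken from the paper; this is not the authors' DT3 module.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def ttt_linear_step(W: Matrix, x: Vector, lr: float) -> Matrix:
    """One gradient step on the loss ||W x - x||^2 with respect to W.

    Assumption: identity input and target views; a real TTT layer would
    use learned projections here.
    """
    err = [p - t for p, t in zip(matvec(W, x), x)]
    # The gradient of the squared error is 2 * outer(err, x).
    return [
        [w_ij - lr * 2.0 * e_i * x_j for w_ij, x_j in zip(row, x)]
        for row, e_i in zip(W, err)
    ]


def ttt_forward(tokens: List[Vector], lr: float = 0.1) -> List[Vector]:
    """Process a sequence: update W on each token, then emit W x."""
    dim = len(tokens[0])
    W = [[0.0] * dim for _ in range(dim)]  # hidden state starts at zero
    outputs = []
    for x in tokens:
        W = ttt_linear_step(W, x, lr)  # update the hidden state first
        outputs.append(matvec(W, x))   # then read out with the new state
    return outputs
```

Calling `ttt_forward([[1.0, 0.0]] * 3, lr=0.1)` yields first coordinates of roughly 0.2, 0.36, and 0.488: each repetition of the token moves the hidden state closer to reconstructing it. The cost per token is constant in the sequence length, which is the linear-complexity property the abstract attributes to recent RNNs.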

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer