TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning

Matthew M. Hong; Jesse Zhang; Anusha Nagabandi; Abhishek Gupta

arXiv:2605.12236·cs.RO·May 13, 2026

TL;DR

This paper introduces a framework that combines diffusion-noise pre-training with timestep-modulated reinforcement learning to improve exploration and sample efficiency in robot policy fine-tuning, enabling real-world manipulation tasks to be learned in under one hour.

Contribution

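It proposes Context-Smoothed Pre-training (CSP), which injects forward-diffusion noise into policy inputs during pre-training, and Timestep-Modulated Reinforcement Learning (TMRL), which controls exploration during RL fine-tuning by modulating the diffusion timestep. Together they improve sample efficiency and apply across different policy input modalities.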

Findings

01. TMRL improves RL fine-tuning sample efficiency.

02. TMRL enables real-world manipulation tasks in under one hour.

03. The framework seamlessly integrates with different policy input modalities.

Abstract

Fine-tuning pre-trained robot policies with reinforcement learning (RL) often inherits the bottlenecks introduced by pre-training with behavioral cloning (BC), which produces narrow action distributions that lack the coverage necessary for downstream exploration. We present a unified framework that bridges BC pre-training and RL fine-tuning and provides the exploration needed for efficient robot policy fine-tuning. Our pre-training method, Context-Smoothed Pre-training (CSP), injects forward-diffusion noise into policy inputs, creating a continuum between precise imitation and broad action coverage. We then fine-tune pre-trained policies via Timestep-Modulated Reinforcement Learning (TMRL), which trains the agent to dynamically adjust this conditioning during fine-tuning by modulating the diffusion timestep, granting explicit control over exploration. Integrating seamlessly with…
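To make the idea of injecting forward-diffusion noise into policy inputs concrete, here is a minimal sketch of the standard DDPM-style forward process applied to an observation vector. It is an illustration under stated assumptions, not the authors' implementation: the linear beta schedule, the 1000-step horizon, and the names `make_alpha_bar` and `smooth_context` are all hypothetical and do not come from the paper.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention coefficients for a linear beta schedule.

    The schedule and its endpoints are assumptions chosen for illustration.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def smooth_context(obs, t, alpha_bar, rng):
    """Apply forward-diffusion noise at timestep t to a policy input.

    Computes sqrt(alpha_bar[t]) * obs + sqrt(1 - alpha_bar[t]) * eps,
    with eps drawn from a standard normal. Small t keeps the input almost
    intact; large t replaces it almost entirely with noise.
    """
    eps = rng.standard_normal(obs.shape)
    a = alpha_bar[t]
    return np.sqrt(a) * obs + np.sqrt(1.0 - a) * eps


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
obs = np.ones(8)  # stand-in for an observation or other policy input

low = smooth_context(obs, 0, alpha_bar, rng)     # nearly the original input
high = smooth_context(obs, 999, alpha_bar, rng)  # nearly pure Gaussian noise
```

In this sketch the timestep `t` is the single knob that moves the conditioning between the two regimes the abstract describes: a low timestep leaves the input close to what the policy saw during imitation, while a high timestep blurs it toward noise. How the RL agent chooses that timestep during fine-tuning is not specified in the text above and is not modeled here.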

Code & Models

Repositories

https://weirdlabuw.github.io/tmrl
