Reinforcement Learning with Action Chunking

Qiyang Li; Zhiyuan Zhou; Sergey Levine

arXiv:2507.07969·cs.LG·May 12, 2026

TL;DR

Q-chunking introduces action chunking into RL to improve exploration and sample efficiency in long-horizon, sparse-reward tasks, especially in offline-to-online settings.

Contribution

Q-chunking applies action chunking directly to the RL action space, which improves both the use of offline data and online exploration in long-horizon tasks.

Findings

01. Outperforms prior offline-to-online RL methods on manipulation tasks.
02. Achieves strong offline performance and online sample efficiency.
03. Leverages temporally consistent behaviors for better exploration.

Abstract

We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an offline prior dataset to maximize the sample-efficiency of online learning. Effective exploration and sample-efficient learning remain central challenges in this setting, as it is not obvious how the offline data should be utilized to acquire a good exploratory policy. Our key insight is that action chunking, a technique popularized in imitation learning where sequences of future actions are predicted rather than a single action at each timestep, can be applied to temporal difference (TD)-based RL methods to mitigate the exploration challenge. Q-chunking adopts action chunking by directly running RL in a 'chunked' action space, enabling the agent to…
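To make the idea of running RL in a "chunked" action space concrete, here is a minimal Python sketch. It is not the authors' implementation. The helper names (`step_chunk`, `chunked_td_target`, `toy_step`), the toy 1-D environment, and the specific form of the TD target are all assumptions made for illustration. The abstract is truncated before it describes the backup the method actually uses, so the target below is only one natural choice: the discounted reward collected inside the chunk plus a bootstrapped value discounted by the number of steps executed.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical environment interface: (state, action) -> (next_state, reward, done).
StepFn = Callable[[int, int], Tuple[int, float, bool]]


def toy_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy sparse-reward task (illustrative only): reach position 3 on a line."""
    next_state = state + action
    done = next_state >= 3
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def step_chunk(
    step_fn: StepFn, state: int, chunk: Sequence[int], gamma: float
) -> Tuple[int, float, bool, int]:
    """Execute a whole chunk of actions open-loop.

    Returns the final state, the discounted sum of rewards collected inside
    the chunk, the done flag, and the number of steps actually executed.
    """
    chunk_return = 0.0
    discount = 1.0
    steps = 0
    done = False
    for action in chunk:
        state, reward, done = step_fn(state, action)
        chunk_return += discount * reward
        discount *= gamma
        steps += 1
        if done:  # stop early if the episode ends mid-chunk
            break
    return state, chunk_return, done, steps


def chunked_td_target(
    chunk_return: float, steps: int, done: bool, next_q: float, gamma: float
) -> float:
    """Assumed TD target for a chunk: in-chunk return + gamma**steps * bootstrap."""
    bootstrap = 0.0 if done else (gamma ** steps) * next_q
    return chunk_return + bootstrap


if __name__ == "__main__":
    # A chunk of two actions is treated as a single decision by the critic.
    state, ret, done, steps = step_chunk(toy_step, 0, [1, 1], gamma=0.5)
    print(state, ret, done, steps)
    print(chunked_td_target(ret, steps, done, next_q=4.0, gamma=0.5))
```

In this sketch the critic would score a state together with an entire chunk, so each update spans several environment steps, and the actions inside a chunk are committed to without re-planning, which is the temporally consistent behavior the summary refers to.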

Videos

Reinforcement Learning with Action Chunking · SlidesLive