ACSAC: Adaptive Chunk Size Actor-Critic with Causal Transformer Q-Network

Qian Chen, Junqiao Zhao, Hongtu Zhou, Hang Yu, Yanping Zhao, Chen Ye, and Guang Chen

arXiv:2605.11009·cs.LG·May 13, 2026

TL;DR

ACSAC introduces an adaptive chunk size actor-critic method using a causal Transformer critic to dynamically select action chunk lengths, improving performance on long-horizon, sparse-reward tasks without manual tuning.

Contribution

The paper proposes ACSAC, a reinforcement learning algorithm that uses a causal Transformer critic to choose action chunk sizes adaptively, removing the need for the fixed chunk size that existing action-chunking methods require.

Findings

01 Achieves state-of-the-art results on long-horizon manipulation tasks.

02 Demonstrates effective adaptive chunk size selection without task-specific tuning.

03 Proves the Bellman operator for ACSAC is a contraction with a unique fixed point.
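
To make the contraction claim concrete, here is a sketch for a generic variable-length chunked policy-evaluation operator. This summary does not show the paper's operator, so the form below, the policy symbol \pi, and the notation a_{t:t+h} (the h actions a_t, ..., a_{t+h-1}) are assumptions used only for illustration.

```latex
% Assumed generic operator: execute h >= 1 actions, then bootstrap from a
% new chunk a' of length h' drawn from the policy \pi at state s_{t+h}.
\[
(\mathcal{T}^{\pi} Q)(s_t, a_{t:t+h})
  = \mathbb{E}\!\left[\sum_{i=0}^{h-1} \gamma^{i} r_{t+i}
  + \gamma^{h}\, Q\big(s_{t+h}, a'_{t+h:t+h+h'}\big)\right],
\qquad (h', a') \sim \pi(\cdot \mid s_{t+h}).
\]

% The reward terms cancel in the difference, and \gamma^{h} \le \gamma
% because h \ge 1 and 0 \le \gamma < 1:
\[
\big\| \mathcal{T}^{\pi} Q_1 - \mathcal{T}^{\pi} Q_2 \big\|_{\infty}
  \le \max_{h \ge 1} \gamma^{h}\, \| Q_1 - Q_2 \|_{\infty}
  = \gamma\, \| Q_1 - Q_2 \|_{\infty}.
\]
```

Under these assumptions the operator is a \gamma-contraction in the sup norm, so the Banach fixed-point theorem gives a unique fixed point. The paper's own proof may use a different operator or argument.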

Abstract

Long-horizon, sparse-reward tasks pose a fundamental challenge for reinforcement learning, since single-step TD learning suffers from bootstrapping error accumulation across successive Bellman updates. Actor-critic methods with action chunking address this by operating over temporally extended actions, which reduce the effective horizon, enable fast value backups, and support temporally consistent exploration. However, existing methods rely on a fixed chunk size and therefore cannot adaptively balance reactivity against temporal consistency. A large fixed chunk size reduces responsiveness to new observations, while a small one produces incoherent motions, forcing task-specific tuning of the chunk size. To address this limitation, we propose Adaptive Chunk Size Actor-Critic (ACSAC). ACSAC leverages a causal Transformer critic to evaluate expected returns for action chunks of different…
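
The idea of scoring chunks of different lengths with a causal critic can be illustrated with a minimal sketch. The abstract is truncated before it describes the selection rule, so the greedy argmax below is an assumption, and `causal_mask`, `select_chunk_size`, and the Q-values are hypothetical names and numbers rather than the authors' implementation.

```python
import numpy as np


def causal_mask(num_tokens: int) -> np.ndarray:
    """Lower-triangular boolean mask: token i may attend to tokens 0..i only.

    With such a mask, the critic output at position h can depend only on the
    first h action tokens, so one forward pass can score every prefix length.
    """
    return np.tril(np.ones((num_tokens, num_tokens), dtype=bool))


def select_chunk_size(prefix_q: np.ndarray) -> int:
    """Return the 1-indexed chunk length whose prefix has the highest Q-value.

    prefix_q[h - 1] is a hypothetical critic estimate for executing the first
    h actions of a candidate chunk. Greedy argmax is an assumed rule.
    """
    return int(np.argmax(prefix_q)) + 1


# Hypothetical critic outputs for prefixes of length 1..4 of one chunk.
prefix_q = np.array([0.20, 0.55, 0.40, 0.10])
chunk = np.arange(4)           # stand-in for four planned actions
h = select_chunk_size(prefix_q)
executed = chunk[:h]           # execute the first h actions, then replan
```

In this sketch a short prefix is chosen when later actions in the chunk look less valuable, which is one way an agent could trade reactivity against temporal consistency without a hand-tuned chunk size.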
