AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models

Jiarui Zhang; Yuchen Yang; Ran Yan; Zhiyu Mei; Liyuan Zhang; Daifeng Li; Wei Fu; Jiaxuan Gao; Shusheng Xu; Yi Wu; Binhang Yuan

arXiv:2602.00482·cs.LG·February 3, 2026



TL;DR

AREAL-DTA is a dynamic tree attention method that speeds up reinforcement learning training of large language models by exploiting prefix sharing among rollouts and traversing the rollout prefix tree dynamically, reaching up to 8.31× higher training throughput.

Contribution

The paper presents a DFS-based execution strategy and a distributed batching mechanism that use prefix sharing to make RL training of large language models scalable and efficient.

Findings

01. Achieves up to 8.31× higher training throughput.

02. Efficiently exploits prefix sharing in RL training.

03. Scales across multiple GPUs with dynamic prefix tree processing.

Abstract

Reinforcement learning (RL)-based post-training for large language models (LLMs) is computationally expensive because it generates many rollout sequences that frequently share long token prefixes. Existing RL frameworks usually process these sequences independently, recomputing identical prefixes in both the forward and backward passes of policy model training, which wastes substantial computation and memory. Although prefix sharing naturally induces a tree structure over rollouts, prior tree-attention-based solutions rely on fully materialized attention masks and scale poorly in RL settings. In this paper, we introduce AREAL-DTA to efficiently exploit prefix sharing in RL training. AREAL-DTA employs a depth-first-search (DFS)-based execution strategy that dynamically traverses the rollout prefix tree during both forward and backward computation,…
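To make the prefix-tree idea concrete, here is a minimal sketch in Python. It is not the AREAL-DTA implementation: it only illustrates how rollouts that share token prefixes collapse into a tree, and how a depth-first walk can visit each shared token once while keeping just the current root-to-node path in memory. The names `TrieNode`, `build_prefix_tree`, `count_nodes`, and `dfs_rollouts` are hypothetical, and the token-level trie is an assumption made for illustration; the real system operates on attention computation, which this sketch does not model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Sequence, Tuple


@dataclass
class TrieNode:
    """One token in a rollout prefix tree (hypothetical structure)."""
    token: Optional[int] = None
    children: Dict[int, "TrieNode"] = field(default_factory=dict)
    ends: int = 0  # number of rollouts that terminate at this node


def build_prefix_tree(rollouts: Sequence[Sequence[int]]) -> TrieNode:
    """Insert every rollout into a trie so shared prefixes are stored once."""
    root = TrieNode()
    for seq in rollouts:
        node = root
        for tok in seq:
            node = node.children.setdefault(tok, TrieNode(tok))
        node.ends += 1
    return root


def count_nodes(root: TrieNode) -> int:
    """Count distinct token positions in the tree (root excluded)."""
    stack: List[TrieNode] = list(root.children.values())
    n = 0
    while stack:
        node = stack.pop()
        n += 1
        stack.extend(node.children.values())
    return n


def dfs_rollouts(root: TrieNode) -> Iterator[Tuple[int, ...]]:
    """Yield each rollout in depth-first order.

    Only the current root-to-node path is held in `path`; a shared
    prefix is pushed once and reused by every rollout beneath it.
    """
    path: List[int] = []

    def visit(node: TrieNode) -> Iterator[Tuple[int, ...]]:
        for tok, child in node.children.items():
            path.append(tok)
            for _ in range(child.ends):
                yield tuple(path)
            yield from visit(child)
            path.pop()

    yield from visit(root)


if __name__ == "__main__":
    # Three toy rollouts sharing the prefix [1, 2] (and two sharing [1, 2, 3]).
    rollouts = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 6]]
    root = build_prefix_tree(rollouts)
    total = sum(len(r) for r in rollouts)
    unique = count_nodes(root)
    print(f"tokens if processed independently: {total}")
    print(f"tokens in the prefix tree:         {unique}")
    for r in dfs_rollouts(root):
        print(r)
```

In this toy example the three rollouts contain 11 tokens when processed independently but only 6 distinct tree nodes, which is the kind of redundancy the abstract describes. How much of that redundancy translates into throughput in practice depends on details this sketch leaves out.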


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification