Dynamic Tokenization via Reinforcement Patching: End-to-end Training and Zero-shot Transfer

Yulun Wu; Sravan Kumar Ankireddy; Samuel Sharpe; Nikita Seleznev; Dehao Yuan; Hyeji Kim; Nam H. Nguyen

arXiv:2603.26097·cs.LG·March 30, 2026

TL;DR

ReinPatch introduces a reinforcement learning framework for dynamic, data-driven sequence patching that jointly optimizes patch boundaries and model performance, enhancing long-horizon sequence modeling.

Contribution

According to the authors, ReinPatch is the first method to jointly optimize a sequence patching policy and its downstream backbone model end-to-end using reinforcement learning, enabling adaptive, hierarchical, and rate-controlled sequence representations.

Findings

01. ReinPatch outperforms state-of-the-art patching strategies on time-series forecasting.

02. The framework allows explicit control over compression rates.

03. The patching module can be extracted as a standalone component for analysis.

Abstract

Efficiently aggregating spatial or temporal horizons to acquire compact representations has become a unifying principle in modern deep learning models, yet learning data-adaptive representations for long-horizon sequence data, especially continuous sequences like time series, remains an open challenge. While fixed-size patching has improved scalability and performance, discovering variable-sized, data-driven patches end-to-end often forces models to rely on soft discretization, specific backbones, or heuristic rules. In this work, we propose Reinforcement Patching (ReinPatch), the first framework to jointly optimize a sequence patching policy and its downstream sequence backbone model using reinforcement learning. By formulating patch boundary placement as a discrete decision process optimized via Group Relative Policy Gradient (GRPG), ReinPatch bypasses the need for continuous…
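The idea of treating boundary placement as a discrete decision trained with a group-relative policy gradient can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the policy is a vector of independent Bernoulli logits (one per time step), the reward is a hypothetical patch-mean reconstruction score with a penalty on deviation from a target average patch length, and the "group-relative" advantage is assumed to mean normalizing rewards by the mean and standard deviation of a group of sampled patchings. The paper's actual policy, reward, and estimator may differ.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_boundaries(logits, rng):
    """Sample a 0/1 boundary mask; position 0 always starts a patch."""
    mask = [1 if rng.random() < sigmoid(z) else 0 for z in logits]
    mask[0] = 1
    return mask


def patches_from_mask(mask):
    """Turn a boundary mask into half-open (start, end) patch spans."""
    starts = [i for i, b in enumerate(mask) if b == 1]
    ends = starts[1:] + [len(mask)]
    return list(zip(starts, ends))


def toy_reward(series, mask, target_rate, rate_weight):
    """Hypothetical reward (not the paper's): negative patch-mean
    reconstruction error minus a penalty on average patch length."""
    spans = patches_from_mask(mask)
    err = 0.0
    for s, e in spans:
        mean = sum(series[s:e]) / (e - s)
        err += sum((x - mean) ** 2 for x in series[s:e])
    rate = len(series) / len(spans)  # average patch length
    return -err / len(series) - rate_weight * abs(rate - target_rate)


def group_relative_advantages(rewards, eps=1e-8):
    """Assumed group-relative baseline: standardize rewards within the group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    return [(r - mean) / (math.sqrt(var) + eps) for r in rewards]


def policy_gradient(logits, masks, advantages):
    """Score-function gradient for independent Bernoulli decisions:
    d log pi / d z_t = b_t - sigmoid(z_t). Position 0 is forced, so it
    receives no gradient."""
    probs = [sigmoid(z) for z in logits]
    grad = [0.0] * len(logits)
    for mask, adv in zip(masks, advantages):
        for t in range(1, len(logits)):
            grad[t] += adv * (mask[t] - probs[t])
    return [g / len(masks) for g in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    series = [0.0] * 8 + [5.0] * 8   # toy signal with one change point
    logits = [0.0] * len(series)     # boundary policy parameters
    for _ in range(200):
        masks = [sample_boundaries(logits, rng) for _ in range(8)]
        rewards = [toy_reward(series, m, 4.0, 0.1) for m in masks]
        adv = group_relative_advantages(rewards)
        grad = policy_gradient(logits, masks, adv)
        logits = [z + 0.5 * g for z, g in zip(logits, grad)]  # gradient ascent
    print([round(sigmoid(z), 2) for z in logits])
```

In this sketch the boundaries are sampled as hard 0/1 decisions and no gradient flows through the sampling step, which is the property that lets a policy-gradient approach avoid soft discretization. In a joint setup the reward would presumably come from the downstream backbone's loss rather than from the toy reconstruction score used here.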
