Dynamic Tokenization via Reinforcement Patching: End-to-end Training and Zero-shot Transfer
Yulun Wu, Sravan Kumar Ankireddy, Samuel Sharpe, Nikita Seleznev, Dehao Yuan, Hyeji Kim, Nam H. Nguyen

TL;DR
ReinPatch introduces a reinforcement learning framework for dynamic, data-driven sequence patching that jointly optimizes patch boundaries and model performance, enhancing long-horizon sequence modeling.
Contribution
It is the first method to jointly optimize sequence patching policies and models end-to-end using reinforcement learning, enabling adaptive, hierarchical, and rate-controlled sequence representations.
Findings
ReinPatch outperforms state-of-the-art patching strategies on time-series forecasting.
The framework allows explicit control over compression rates.
The patching module can be extracted as a standalone component for analysis.
Abstract
Efficiently aggregating spatial or temporal horizons to acquire compact representations has become a unifying principle in modern deep learning models, yet learning data-adaptive representations for long-horizon sequence data, especially continuous sequences like time series, remains an open challenge. While fixed-size patching has improved scalability and performance, discovering variable-sized, data-driven patches end-to-end often forces models to rely on soft discretization, specific backbones, or heuristic rules. In this work, we propose Reinforcement Patching (ReinPatch), the first framework to jointly optimize a sequence patching policy and its downstream sequence backbone model using reinforcement learning. By formulating patch boundary placement as a discrete decision process optimized via Group Relative Policy Gradient (GRPG), ReinPatch bypasses the need for continuous…
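To make the idea of treating patch boundary placement as a discrete decision process more concrete, here is a minimal sketch of a group-relative policy gradient update over binary boundary decisions. It is an illustration under stated assumptions, not the authors' implementation: the independent Bernoulli policy, the function names, the toy reward (piecewise-mean reconstruction error plus a patch-rate penalty), and the `rate_penalty` parameter are all hypothetical, and the truncated abstract does not specify how ReinPatch parameterizes its policy or reward.

```python
import numpy as np

def sample_boundaries(logits, group_size, rng):
    """Sample `group_size` binary boundary masks from independent Bernoulli logits.

    Assumption: entry i of a mask equal to 1 means "cut between step i and i+1".
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    masks = (rng.random((group_size, logits.shape[0])) < probs).astype(np.float64)
    return masks, probs

def toy_reward(mask, series, rate_penalty=0.1):
    """Hypothetical reward: negative piecewise-mean reconstruction error,
    minus a penalty proportional to the fraction of positions that are cut."""
    cuts = np.flatnonzero(mask) + 1          # split indices into the series
    segments = np.split(series, cuts)        # variable-sized patches
    sq_err = sum(((seg - seg.mean()) ** 2).sum() for seg in segments)
    return -sq_err / series.shape[0] - rate_penalty * mask.mean()

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within the sampled group (no learned value baseline)."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def policy_gradient(masks, probs, advantages):
    """Score-function gradient w.r.t. the logits, averaged over the group.

    For a Bernoulli with logit z, d log p(b) / dz = b - sigmoid(z).
    """
    return (advantages[:, None] * (masks - probs[None, :])).mean(axis=0)

def train_step(logits, series, rng, group_size=16, lr=0.5):
    """One gradient-ascent step on the boundary logits."""
    masks, probs = sample_boundaries(logits, group_size, rng)
    rewards = np.array([toy_reward(m, series) for m in masks])
    advantages = group_relative_advantages(rewards)
    return logits + lr * policy_gradient(masks, probs, advantages)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Piecewise-constant toy series with a single level change at index 8.
    series = np.concatenate([np.zeros(8), np.ones(8)])
    logits = np.zeros(series.shape[0] - 1)   # one logit per candidate boundary
    for _ in range(200):
        logits = train_step(logits, series, rng)
    print(np.round(1.0 / (1.0 + np.exp(-logits)), 2))
```

In this sketch the penalty weight plays the role of a compression-rate knob: raising `rate_penalty` makes each additional boundary more costly. In a joint setup, the reward would presumably come from the downstream backbone's loss rather than a fixed reconstruction error, but that coupling is beyond what the abstract excerpt describes.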
