Unified token representations for sequential decision models
Zhuojing Tian, Yushu Chen

TL;DR
This paper introduces a unified token representation for sequential decision models in reinforcement learning, reducing complexity and improving scalability while maintaining or enhancing performance.
Contribution
It proposes a unified token representation that merges return-to-go, state, and action into a single token per timestep, reducing sequence length and model complexity in RL sequence models.
Findings
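The Unified Token Representation (UTR) reduces sequence length and model complexity.
UTR-based models achieve performance comparable or superior to state-of-the-art methods.
Theoretical analysis shows a tighter Rademacher complexity bound, suggesting improved generalization.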
Abstract
Transformers have demonstrated strong potential in offline reinforcement learning (RL) by modeling trajectories as sequences of return-to-go, states, and actions. However, existing approaches such as the Decision Transformer (DT) and its variants suffer from redundant tokenization and quadratic attention complexity, limiting their scalability in real-time or resource-constrained settings. To address this, we propose a Unified Token Representation (UTR) that merges return-to-go, state, and action into a single token, substantially reducing sequence length and model complexity. Theoretical analysis shows that UTR leads to a tighter Rademacher complexity bound, suggesting improved generalization. We further develop two variants, UDT and UDC, built upon transformer and gated CNN backbones, respectively. Both achieve comparable or superior performance to state-of-the-art methods with markedly…
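To make the sequence-length argument concrete, here is a minimal sketch comparing a DT-style interleaved layout (three tokens per timestep) with a unified layout (one token per timestep). The abstract does not specify how the three elements are merged, so plain concatenation is an assumption made purely for illustration. The function names (`interleaved_tokens`, `unified_tokens`, `attention_pairs`) and the toy dimensions are hypothetical and not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def interleaved_tokens(
    rtg: Sequence[float],
    states: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
) -> List[Vector]:
    """DT-style layout: three separate tokens (R_t, s_t, a_t) per timestep."""
    tokens: List[Vector] = []
    for r, s, a in zip(rtg, states, actions):
        tokens.append([r])
        tokens.append(list(s))
        tokens.append(list(a))
    return tokens


def unified_tokens(
    rtg: Sequence[float],
    states: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
) -> List[Vector]:
    """Assumed merge: concatenate R_t, s_t, a_t into one vector per timestep.

    This is an illustrative stand-in, not the paper's confirmed operation.
    """
    return [[r, *s, *a] for r, s, a in zip(rtg, states, actions)]


def attention_pairs(seq_len: int) -> int:
    """Number of query-key pairs in full self-attention (quadratic in length)."""
    return seq_len * seq_len


# Toy trajectory: 4 timesteps, 3-dimensional states, 2-dimensional actions.
T = 4
rtg = [10.0, 9.0, 7.5, 5.0]
states = [[0.1 * t, 0.2 * t, 0.3 * t] for t in range(T)]
actions = [[1.0, -1.0] for _ in range(T)]

dt_seq = interleaved_tokens(rtg, states, actions)
utr_seq = unified_tokens(rtg, states, actions)

print(len(dt_seq), len(utr_seq))  # 12 4
print(attention_pairs(len(dt_seq)) // attention_pairs(len(utr_seq)))  # 9
```

Under these assumptions, a trajectory of T timesteps shrinks from 3T tokens to T tokens, so the number of attention pairs falls by a factor of nine. How the merged vector is embedded, and which action is available at prediction time, are details this sketch leaves open.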
