TrajTok: Adaptive Spatial Tokenization for Trajectory Representation Learning
Zhen Xiong, Shang-Ling Hsu, Cyrus Shahabi

TL;DR
TrajTok introduces a novel multi-resolution spatial tokenization and transformer-based encoder for learning transferable and generalizable trajectory representations from noisy GPS data.
Contribution
It proposes a new trajectory encoding method with multi-resolution spatial tokenization and a specialized transformer architecture, enabling effective pretraining and transfer learning.
Findings
Outperforms task-specific methods on various trajectory tasks.
Supports both geometry and kinematics with a single frozen encoder.
Demonstrates transferability of learned trajectory structure.
Abstract
Learning generalizable trajectory representations from raw GPS traces remains difficult because the data is continuous, noisy, and irregularly sampled. Spatial tokenization is also challenging: fine grids yield sparse cells with weak embeddings, while coarse grids merge heterogeneous movement patterns into the same token. We present TrajTok, a trajectory encoder with a simple pretraining recipe for transferable trajectory embeddings. TrajTok first learns a multi-resolution hexagonal cell partition from the spatial distribution of GPS points, converting noisy GPS sequences into discrete cell tokens. To capture both geometry and kinematics, it uses a factorized transformer encoder with early per-modality self-attention blocks, cross-attention fusion layers, and spatiotemporal rotary position embeddings, ST-RoPE, to encode where and when each token occurs. TrajTok is pretrained with…
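To make the tokenization trade-off in the abstract concrete, here is a minimal sketch of density-adaptive spatial tokenization. It is not the authors' method. It swaps the hexagonal cells for square quadtree cells to stay dependency-free, and the splitting rule (subdivide a cell while it holds more than `max_points` points, up to `max_depth`) is an assumption, since the abstract does not state how the partition is learned. The names `build_partition`, `contains`, and `tokenize` are hypothetical.

```python
def contains(cell, p):
    """True if point p = (x, y) lies in the half-open square cell (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = cell
    return x0 <= p[0] < x1 and y0 <= p[1] < y1


def build_partition(points, bounds, max_points=4, max_depth=6):
    """Split square cells recursively wherever the data is dense.

    Assumed rule (not from the paper): a cell is subdivided into four
    quadrants while it holds more than max_points points and its depth
    is below max_depth. Returns the list of leaf cells.
    """
    leaves = []

    def split(cell, pts, depth):
        x0, y0, x1, y1 = cell
        if len(pts) <= max_points or depth == max_depth:
            leaves.append(cell)
            return
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = [
            (x0, y0, mx, my), (mx, y0, x1, my),
            (x0, my, mx, y1), (mx, my, x1, y1),
        ]
        for q in quadrants:
            split(q, [p for p in pts if contains(q, p)], depth + 1)

    split(bounds, list(points), 0)
    return leaves


def tokenize(trajectory, leaves):
    """Map each GPS point to the index of its leaf cell (-1 if out of bounds)."""
    return [
        next((i for i, c in enumerate(leaves) if contains(c, p)), -1)
        for p in trajectory
    ]


# Toy data: a dense cluster near the origin plus three isolated points.
points = [
    (0.5, 0.5), (0.6, 0.7), (1.5, 0.5), (1.2, 1.8), (0.2, 1.1), (1.9, 1.9),
    (6.0, 6.0), (5.0, 2.0), (2.0, 6.0),
]
leaves = build_partition(points, (0.0, 0.0, 8.0, 8.0), max_points=2, max_depth=4)
tokens = tokenize(points, leaves)
```

In this toy setup the dense cluster ends up in small cells while the isolated points keep large ones, so no token is left with very few observations and no single token absorbs the whole cluster. The linear scan in `tokenize` is for clarity only; a real implementation would use a spatial index.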
