TrajFlow: Nation-wide Pseudo GPS Trajectory Generation with Flow Matching Models
Peiran Li, Jiawei Wang, Haoran Zhang, Xiaodan Shi, Noboru Koshizuka, Chihiro Shimizu, Renhe Jiang

TL;DR
TrajFlow is a novel flow-matching-based generative model that produces high-fidelity, diverse, and scalable pseudo-GPS trajectories across multiple geographic scales, addressing privacy and efficiency limitations of existing methods.
Contribution
It introduces the first flow-matching approach for nationwide GPS trajectory generation, enhancing scalability, diversity, and efficiency over prior diffusion-based models.
Findings
Outperforms diffusion-based and deep generative models at multiple scales
Successfully generates diverse trajectories across urban, metropolitan, and nationwide levels
Demonstrates potential for urban planning, traffic management, and disaster response
Abstract
The importance of mobile phone GPS trajectory data is widely recognized across many fields, yet the use of real data is often hindered by privacy concerns, limited accessibility, and high acquisition costs. As a result, generating pseudo-GPS trajectory data has become an active area of research. Recent diffusion-based approaches have achieved strong fidelity but remain limited in spatial scale (small urban areas), transportation-mode diversity, and efficiency (requiring numerous sampling steps). To address these challenges, we introduce TrajFlow, which to the best of our knowledge is the first flow-matching-based generative model for GPS trajectory generation. TrajFlow leverages the flow-matching paradigm to improve robustness and efficiency across multiple geospatial scales, and incorporates a trajectory harmonization and reconstruction strategy to jointly address scalability,…
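To make the efficiency claim concrete, here is a minimal sketch of generic flow matching, not TrajFlow's actual implementation, which this page does not describe. It assumes the common straight-line interpolation path between a noise sample and a data sample. The function names, the toy oracle velocity field, and the example coordinate are all hypothetical. Because the target velocity is constant along a straight path, a fixed-step Euler solver can reach the data point in very few steps.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Regression target for a velocity model along the straight path."""
    return [b - a for a, b in zip(x0, x1)]


def fm_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at one (x0, x1, t)."""
    xt = interpolate(x0, x1, t)
    v_pred = predict_velocity(xt, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict_velocity, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # t never reaches 1, so the oracle below never divides by zero
        v = predict_velocity(x, t)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical single "data" point (lon, lat), used only for illustration.
DATA_POINT = [139.76, 35.68]


def oracle_velocity(x, t):
    """Closed-form velocity that moves any point toward DATA_POINT along a straight line.

    Stands in for a trained network; a real model would be learned by minimizing fm_loss.
    """
    return [(d - xi) / (1.0 - t) for d, xi in zip(DATA_POINT, x)]


if __name__ == "__main__":
    print(euler_sample(oracle_velocity, [0.0, 0.0], num_steps=4))
```

In practice the oracle would be replaced by a neural network conditioned on trajectory context, and the number of Euler steps becomes the main knob trading sampling cost against fidelity.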
Peer Reviews
Decision · ICLR 2026 Poster
1. Very strong performance for trajectory generation at nation-level scale
2. The ablation study not only shows the importance of each part, but also discusses some of the inherent limitations of using a global coordinate frame when generating trajectories, which is that it introduces the risk of small details being lost when different trajectories have different scales.
3. Provides new insights into what is important for generating trajectories: trajectory simplification, normalization, and flo

1. All results are based on a single dataset, which is not publicly accessible.
2. Auxiliary data is required to sample from the model: departure times, OD pairs, and transportation modes.
3. Limited discussion around the risk of memorization. The DTW measure shows the average DTW distance to the closest real trajectory. While achieving a low score might seem positive, it could also indicate that the model has learned to copy the training set, potentially increasing the risk of leaking private

S1. This paper studies the problem of pseudo GPS trajectory generation, which seems interesting.
S2. The paper presents the first flow-matching-based generative framework.
S3. Experiments show that the proposed TrajFlow outperforms the existing baselines.

W1. Novelty: The paper would benefit from a deeper discussion clarifying the differences between the proposed approach and more recent baselines.
W2. Datasets: Experiments are conducted on only one dataset, which limits the generalizability of the conclusions. It is recommended to include additional commonly used datasets such as Chengdu and Xi’an to strengthen the empirical validation.
W3. Baseline: The baselines used for comparison (from 2020, 2021, and 2023) are relatively outdated. The pape

1. This paper presents the first application of the flow-matching paradigm to the task of GPS trajectory generation, which addresses the problem that the performance of existing models, especially diffusion models, degrades dramatically when scaling from small urban scales to regional or national scales.
2. TrajFlow is much more efficient than the computationally expensive diffusion model, which requires a large number of sampling steps. TrajFlow achieves high fidelity in generation with only ab

1. This paper lacks some details about the reproducibility of the models and algorithms, such as model architecture, parameter settings, code, algorithm process, etc.
2. This paper uses trajectory data with multiple travel modes and a national scale. However, it's not publicly available, and the authors do not present many details about the dataset. This limits the reader's ability to review the technical performance of this paper in depth.
3. The main evaluation metrics of the paper are biased
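The DTW measure raised in the reviews above (average DTW distance from each generated trajectory to its closest real trajectory) can be sketched as follows. This is an illustrative version only: the function names are hypothetical, and using plain Euclidean distance between coordinate pairs is an assumption, since the excerpts here do not state which point distance the paper uses.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two point sequences.

    Each sequence is a list of (x, y) tuples; the per-point cost is
    Euclidean distance (an assumption made for this sketch).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def mean_nearest_dtw(generated, real):
    """Average, over generated trajectories, of the DTW cost to the closest real one."""
    nearest = [min(dtw_distance(g, r) for r in real) for g in generated]
    return sum(nearest) / len(nearest)
```

A generated trajectory that exactly duplicates a real one contributes zero to this average, which is why a very low score cannot by itself distinguish high fidelity from memorization of the training set.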
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Traffic Prediction and Management Techniques · Automated Road and Building Extraction
