Pre-training Contextual Location Embeddings in Personal Trajectories via Efficient Hierarchical Location Representations
Chung Park, Taesan Kim, Junui Hong, Minsung Choi, Jaegul Choo

TL;DR
This paper introduces a Geo-Tokenizer and Hierarchical Auto-regressive Location Model to efficiently pre-train location embeddings from human mobility data, enabling scalable and accurate modeling of large-scale location data for location-based services.
Contribution
The paper presents a novel Geo-Tokenizer for reducing location vocabulary size and a hierarchical auto-regressive training method for scalable, dynamic location embedding pre-training.
Findings
Significantly improves downstream task performance
Reduces model parameters compared to existing methods
Effective on real-world trajectory datasets
Abstract
Pre-training the embedding of a location generated from human mobility data has become a popular method for location-based services. In practice, modeling the location embedding is too expensive, due to the large number of locations to be trained in situations with fine-grained resolution or extensive target regions. Previous studies have handled fewer than ten thousand distinct locations, which is insufficient in real-world applications. To tackle this problem, we propose a Geo-Tokenizer, designed to efficiently reduce the number of locations to be trained by representing a location as a combination of several grids at different scales. In the Geo-Tokenizer, a grid at a larger scale shares the common set of grids at smaller scales, which is a key factor in reducing the size of the location vocabulary. The sequences of locations preprocessed with the Geo-Tokenizer are utilized by a…
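To make the vocabulary-reduction idea concrete, here is a minimal Python sketch of a hierarchical grid tokenizer. It is not the authors' implementation: the function names `geo_tokenize` and `vocab_sizes`, the bounding box, and the per-level split counts are all assumptions chosen for illustration. The sketch assumes each level divides its parent cell into an n × n grid and that the token at each level is the cell index relative to the parent, so every parent cell reuses the same small set of sub-cell tokens.

```python
from typing import List, Sequence, Tuple


def geo_tokenize(
    lat: float,
    lon: float,
    bbox: Tuple[float, float, float, float],
    splits: Sequence[int],
) -> List[int]:
    """Map a coordinate to one token per grid level (coarse to fine).

    bbox is (lat_min, lat_max, lon_min, lon_max); splits[i] is the number
    of rows/columns at level i. Both are hypothetical parameters.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    # Normalize the coordinate to the unit square.
    y = (lat - lat_min) / (lat_max - lat_min)
    x = (lon - lon_min) / (lon_max - lon_min)
    tokens = []
    for n in splits:
        # Cell index inside the current parent cell (clamped at the edge).
        row = min(int(y * n), n - 1)
        col = min(int(x * n), n - 1)
        tokens.append(row * n + col)
        # Re-normalize to the position inside the chosen cell.
        y = y * n - row
        x = x * n - col
    return tokens


def vocab_sizes(splits: Sequence[int]) -> Tuple[int, int]:
    """Return (flat, hierarchical) vocabulary sizes for the given splits."""
    flat = 1
    for n in splits:
        flat *= n * n  # one token per finest-level cell
    hierarchical = sum(n * n for n in splits)  # tokens shared across parents
    return flat, hierarchical


tokens = geo_tokenize(2.5, 7.5, (0.0, 10.0, 0.0, 10.0), (10, 10))
flat, hierarchical = vocab_sizes((10, 10))
```

Under these assumptions, a two-level 10 × 10 scheme addresses 10,000 finest-level cells with only 200 distinct tokens, because the second-level tokens are shared by every first-level cell. How the paper actually combines the per-level tokens into a location embedding is not described in this excerpt.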
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Context-Aware Activity Recognition Systems · Data-Driven Disease Surveillance
