Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back
Ruixing Zhang, Yang Zhang, Tongyu Zhu, Leilei Sun, Weifeng Lv

TL;DR
This paper introduces VLMLocPredictor, a vision-based model that predicts next locations in human mobility trajectories by leveraging visual reasoning and reinforcement learning on map images, achieving state-of-the-art results.
Contribution
It proposes a novel vision-based approach using VLMs with reinforcement learning for next location prediction, mimicking human reasoning over maps.
Findings
Achieves state-of-the-art performance on multiple city datasets.
Demonstrates superior cross-city generalization.
Enables models to reason over maps similarly to humans.
Abstract
Next Location Prediction is a fundamental task in the study of human mobility, with wide-ranging applications in transportation planning, urban governance, and epidemic forecasting. In practice, when humans attempt to predict the next location in a trajectory, they often visualize the trajectory on a map and reason based on road connectivity and movement trends. However, the vast majority of existing next-location prediction models do not reason over maps in the way that humans do. Fortunately, the recent development of Vision-Language Models (VLMs) has demonstrated strong capabilities in visual perception and even visual reasoning. This opens up a new possibility: by rendering both the road network and trajectory onto an image and leveraging the reasoning abilities of VLMs, we can enable models to perform trajectory inference in a human-like manner. To explore this idea, we…
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Automated Road and Building Extraction · Data Management and Algorithms
