Demo2Vec: Learning Region Embedding with Demographic Information

Ya Wen; Yulun Zhou

arXiv:2409.16837·cs.LG·September 26, 2024

TL;DR

This paper introduces Demo2Vec, a novel region embedding method that integrates demographic data with mobility information, improving urban prediction tasks and proposing a new divergence measure for multi-view learning.

Contribution

Demo2Vec incorporates demographic data into region embedding, an input that few prior studies have used, and improves predictive performance across three urban tasks.

Findings

01

Pre-training on mobility + income data yields up to 10.22% better predictions in experiments on New York and Chicago.

02

Jensen-Shannon divergence outperforms KL divergence for multi-view learning.

03

Geographic proximity + income is an effective alternative in data-scarce settings.

Abstract

Demographic data, such as income, education level, and employment rate, contain valuable information about urban regions, yet few studies have integrated demographic information to generate region embeddings. In this study, we show how simple and easy-to-access demographic data can improve the quality of state-of-the-art region embeddings and provide better predictive performance across three common urban tasks, namely check-in prediction, crime rate prediction, and house price prediction. We find that existing pre-training methods based on KL divergence are potentially biased towards mobility information and propose to use Jensen-Shannon divergence as a more appropriate loss function for multi-view representation learning. Experimental results from both New York and Chicago show that mobility + income is the best pre-training data combination, providing up to 10.22% better…
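The two divergences named in the abstract can be compared with a minimal sketch. This is not the paper's loss function or training code; it only illustrates the standard textbook definitions on two discrete distributions. The function names, the `eps` smoothing constant, and the example vectors `mobility_view` and `income_view` are all hypothetical and chosen for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as lists of probabilities.

    eps is an assumed smoothing constant that avoids division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical distributions standing in for two views of the same region.
mobility_view = [0.7, 0.2, 0.1]
income_view = [0.1, 0.3, 0.6]

kl_forward = kl_divergence(mobility_view, income_view)
kl_reverse = kl_divergence(income_view, mobility_view)
js_forward = js_divergence(mobility_view, income_view)
js_reverse = js_divergence(income_view, mobility_view)

print(f"KL(mobility || income) = {kl_forward:.4f}")
print(f"KL(income || mobility) = {kl_reverse:.4f}")
print(f"JS(mobility, income)   = {js_forward:.4f}")
print(f"JS(income, mobility)   = {js_reverse:.4f}")
```

KL divergence is asymmetric, so its value depends on which view is treated as the reference. Jensen-Shannon divergence is symmetric in its arguments and, with natural logarithms, bounded above by ln 2.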

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Image Retrieval and Classification Techniques