UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang

TL;DR
UrbanCLIP introduces a novel framework that integrates textual descriptions generated by LLMs with satellite imagery to enhance urban region profiling, demonstrating significant improvements over existing methods in predicting urban indicators.
Contribution
This paper pioneers the integration of textual modality via LLMs into urban imagery profiling, creating the first LLM-enhanced contrastive learning framework for this task.
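To make the idea of contrastive text-image learning concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss over paired image and text embeddings. It is not the authors' implementation: the function names, the plain-list embedding format, and the temperature value of 0.07 are all assumptions chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses for matched pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same region.
    The temperature is a hypothetical default, not a value from the paper.
    """
    if len(image_embs) != len(text_embs) or not image_embs:
        raise ValueError("need the same non-zero number of images and texts")
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_transposed)
    )
```

The loss is small when each image embedding is closest to its own description and large when the pairs are mismatched, which is the signal that pulls matching image and text representations together during pretraining.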
Findings
Achieved an average 6.1% improvement in R^2 for urban indicator prediction
Demonstrated the effectiveness of text-image joint learning in urban profiling
Provided a new dataset and code for future research
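For readers unfamiliar with the metric in the first finding, the sketch below computes the coefficient of determination R^2 from true and predicted indicator values. The helper `relative_improvement` is hypothetical: this summary does not state whether the 6.1% figure is a relative or an absolute gain, so the relative form is shown only as one possible reading.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


def relative_improvement(baseline_r2, new_r2):
    """Relative change in R^2 over a baseline (an assumed reading of 'improvement')."""
    if baseline_r2 == 0:
        raise ValueError("baseline R^2 must be non-zero")
    return (new_r2 - baseline_r2) / baseline_r2
```

A perfect predictor scores 1.0, and a model that always predicts the mean of the true values scores 0.0, so a higher R^2 means the model explains more of the variance in the urban indicator.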
Abstract
Urban region profiling from web-sourced data is of utmost importance for urban planning and sustainable development. LLMs are increasingly applied across fields, especially in multi-modal research such as vision-language learning, where the text modality serves as supplementary information for the image. Since the textual modality has never been introduced into modality combinations in urban region profiling, we aim to answer two fundamental questions in this paper: i) Can the textual modality enhance urban region profiling? ii) If so, in what ways and with regard to which aspects? To answer these questions, we leverage the power of Large Language Models (LLMs) and introduce the first-ever LLM-enhanced framework that integrates the knowledge of textual modality into urban imagery profiling, named LLM-enhanced Urban Region Profiling with Contrastive Language-Image…
Taxonomy
Topics: Geographic Information Systems Studies · Human Mobility and Location-Based Analysis · Text and Document Classification Technologies
