SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation
Zekun Li, Jina Kim, Yao-Yi Chiang, Muhao Chen

TL;DR
SpaBERT is a novel spatial language model that leverages geographic data and spatial context to generate effective geo-entity representations, improving performance in geo-entity typing and linking tasks.
Contribution
The paper introduces SpaBERT, a pretrained model that extends BERT with spatial context encoding for geo-entities, which it presents as a novel approach to geographic data analysis.
Findings
SpaBERT outperforms existing models on geo-entity typing.
SpaBERT improves geo-entity linking accuracy.
Spatial coordinate embedding enhances entity representations.
Abstract
Named geographic entities (geo-entities for short) are the building blocks of many geographic datasets. Characterizing geo-entities is integral to various application domains, such as geo-intelligence and map comprehension, while a key challenge is to capture the spatial-varying context of an entity. We hypothesize that we shall know the characteristics of a geo-entity by its surrounding entities, similar to knowing word meanings by their linguistic context. Accordingly, we propose a novel spatial language model, SpaBERT, which provides a general-purpose geo-entity representation based on neighboring entities in geospatial data. SpaBERT extends BERT to capture linearized spatial context, while incorporating a spatial coordinate embedding mechanism to preserve spatial relations of entities in the 2-dimensional space. SpaBERT is pretrained with masked language modeling and masked entity…
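The two ideas named in the abstract, linearized spatial context and a spatial coordinate embedding, can be illustrated with a minimal sketch. The abstract does not spell out the details, so everything here is an assumption made for illustration: neighbors are ordered by Euclidean distance to a pivot entity, and coordinate offsets are encoded with a sinusoidal function. The names `linearize`, `sinusoidal`, and `coordinate_embeddings`, and the entity dictionaries, are hypothetical and are not taken from the authors' code.

```python
import math


def linearize(pivot, neighbors):
    """Turn a 2D neighborhood into a 1D pseudo-sentence.

    Assumption: neighbors are sorted by Euclidean distance to the pivot,
    with the pivot placed first. The source only says the context is
    "linearized"; the ordering rule here is illustrative.
    """
    px, py = pivot["xy"]

    def distance(entity):
        x, y = entity["xy"]
        return math.hypot(x - px, y - py)

    return [pivot] + sorted(neighbors, key=distance)


def sinusoidal(value, dim, base=10000.0):
    """Encode a continuous scalar as interleaved sin/cos features.

    Assumption: a Transformer-style sinusoidal encoding applied to a
    real-valued offset instead of an integer token position.
    """
    features = []
    for i in range(0, dim, 2):
        freq = 1.0 / (base ** (i / dim))
        features.append(math.sin(value * freq))
        features.append(math.cos(value * freq))
    return features[:dim]


def coordinate_embeddings(sequence, dim=8):
    """Return one (x_embedding, y_embedding) pair per entity.

    Offsets are measured relative to the first entity (the pivot), so the
    relative layout in the 2D plane is kept alongside the linear order.
    """
    px, py = sequence[0]["xy"]
    pairs = []
    for entity in sequence:
        x, y = entity["xy"]
        pairs.append((sinusoidal(x - px, dim), sinusoidal(y - py, dim)))
    return pairs


if __name__ == "__main__":
    # Hypothetical entities with planar coordinates in arbitrary units.
    pivot = {"name": "Pivot Cafe", "xy": (0.0, 0.0)}
    neighbors = [
        {"name": "Far Museum", "xy": (3.0, 4.0)},
        {"name": "Near Bakery", "xy": (1.0, 0.0)},
        {"name": "Mid Park", "xy": (0.0, 2.0)},
    ]
    sequence = linearize(pivot, neighbors)
    print([entity["name"] for entity in sequence])
    print(coordinate_embeddings(sequence, dim=4)[1])
```

In a full model, the entity names in the pseudo-sentence would be tokenized and the coordinate features would be combined with the token embeddings before the Transformer layers; that step is omitted here because the abstract does not describe it.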
Taxonomy
Topics: Geographic Information Systems Studies · Data Management and Algorithms · Data Quality and Management
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Dropout · WordPiece · Dense Connections · Softmax
