GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale
Nicolas Tempelmeier, Simon Gottschalk, Elena Demidova

TL;DR
GeoVectors is a comprehensive, world-scale linked open corpus of OSM entity embeddings that captures semantic and geographic information, enabling machine learning and semantic applications on over 980 million geographic entities.
Contribution
The paper introduces a novel, large-scale linked open corpus of OSM entity embeddings with semantic links to Wikidata and DBpedia, making OSM entities directly usable in machine learning and semantic applications.
Findings
Coverage of over 980 million entities across 180 countries
Semantic links to Wikidata and DBpedia for contextual information
Accessible via a SPARQL endpoint for direct querying
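As a rough illustration of what direct querying over a SPARQL endpoint could look like, the sketch below builds a SPARQL 1.1 Protocol GET request in Python without sending it. The endpoint URL is a placeholder, and the use of owl:sameAs as the link property to Wikidata and DBpedia is an assumption made for illustration; the summary above states neither, so check the corpus documentation for the actual address and vocabulary.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL (an assumption); replace it with the address
# published by the corpus maintainers.
ENDPOINT = "https://example.org/geovectors/sparql"


def build_query(limit: int = 10) -> str:
    """Return a SPARQL query listing entities and their linked resources.

    owl:sameAs is assumed here as the link property; the real corpus
    vocabulary may differ.
    """
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?entity ?linked WHERE {\n"
        "  ?entity owl:sameAs ?linked .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL 1.1 Protocol GET request (query= parameter)."""
    return endpoint + "?" + urlencode({"query": query})


# Build the request URL only; no network call is made here.
url = build_request_url(ENDPOINT, build_query(5))
```

The resulting URL can be passed to any HTTP client. Requesting the results with an Accept header of application/sparql-results+json is the usual way to get JSON back from a standards-compliant endpoint.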
Abstract
OpenStreetMap (OSM) is currently the richest publicly available information source on geographic entities (e.g., buildings and roads) worldwide. However, using OSM entities in machine learning models and other applications is challenging due to the large scale of OSM, the extreme heterogeneity of entity annotations, and a lack of a well-defined ontology to describe entity semantics and properties. This paper presents GeoVectors - a unique, comprehensive world-scale linked open corpus of OSM entity embeddings covering the entire OSM dataset and providing latent representations of over 980 million geographic entities in 180 countries. The GeoVectors corpus captures semantic and geographic dimensions of OSM entities and makes these entities directly accessible to machine learning algorithms and semantic applications. We create a semantic description of the GeoVectors corpus, including…
