A Topic Modeling Approach to Classifying Open Street Map Health Clinics and Schools in Sub-Saharan Africa
Joshua W. Anderson, Luis Iñaki Alberro Encina, Tina George Karippacheril, Jonathan Hersh, Cadence Stringer

TL;DR
This paper presents an unsupervised topic modeling approach to classify and locate schools and health clinics in OpenStreetMap data across ten African countries, improving data utility for policy and development efforts.
Contribution
It introduces a scalable, unsupervised method to extract public service locations from unstructured OSM data, enhancing classification accuracy over traditional key-based methods.
Findings
Improved classification performance using topic modeling.
Validated OSM-derived locations against WHO data.
Identified coverage gaps in OSM data across Africa.
Abstract
Data deprivation, or the lack of easily available and actionable information on the well-being of individuals, is a significant challenge for the developing world and an impediment to the design and operationalization of policies intended to alleviate poverty. In this paper we explore the suitability of data derived from OpenStreetMap to proxy for the location of two crucial public services: schools and health clinics. Thanks to the efforts of thousands of digital humanitarians, online mapping repositories such as OpenStreetMap contain millions of records on buildings and other structures, delineating both their location and often their use. Unfortunately, much of this data is locked in complex, unstructured text, rendering it seemingly unsuitable for classifying schools or clinics. We apply a scalable, unsupervised learning method to unlabeled OpenStreetMap building data to extract the…
Taxonomy
Topics: Geographic Information Systems Studies · Data-Driven Disease Surveillance · ICT in Developing Communities
