Lost in Space: Geolocation in Event Data

Sophie J. Lee, Howard Liu, and Michael D. Ward

arXiv:1611.04837 · cs.CL · August 28, 2019

TL;DR

This paper presents a supervised machine learning approach to accurately identify correct location information in event texts, significantly improving geolocation accuracy over existing dictionary-based methods.

Contribution

Introduces a two-stage machine learning algorithm that classifies location words in news texts, enhancing geolocation accuracy for event data.

Findings

1. Improves geolocation accuracy by up to 25% over dictionary methods.

2. Uses contextual features like N-grams and mention frequency for classification.

3. Validated on ICEWS and OEDA datasets with positive results.

Abstract

Extracting the "correct" location information from text data, i.e., determining the place of an event, has long been a goal for automated text processing. To approximate a human-like coding schema, we introduce a supervised machine learning algorithm that classifies each location word as either correct or incorrect. We test the algorithm, which consists of two stages, on news articles collected from around the world (Integrated Crisis Early Warning System [ICEWS] data and Open Event Data Alliance [OEDA] data). In the feature selection stage, we extract contextual information from texts, namely, the N-gram patterns for location words, the frequency of mention, and the context of the sentences containing location words. In the classification stage, we use three classifiers to estimate the model parameters in the training set and then to predict whether a location word in the test set news…
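
To make the feature selection stage more concrete, here is a minimal Python sketch of how contextual features might be collected for each location word. It is an illustration under stated assumptions, not the authors' implementation: the toy `GAZETTEER`, the `tokenize` and `location_features` helpers, the window size of two tokens, and the `relative_position` feature (a crude stand-in for sentence context) are all hypothetical.

```python
from collections import Counter

# Hypothetical toy gazetteer; a real system would use a full place-name dictionary.
GAZETTEER = {"paris", "berlin", "london"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.split():
        tok = raw.strip(".,;:!?\"'()").lower()
        if tok:
            tokens.append(tok)
    return tokens


def location_features(text, window=2):
    """Return one feature dict per location mention found in `text`.

    The features loosely mirror the three kinds named in the abstract:
    N-gram patterns around the location word, frequency of mention,
    and a rough positional proxy for context (an assumption here).
    """
    tokens = tokenize(text)
    # Count how often each known location word appears in the article.
    counts = Counter(t for t in tokens if t in GAZETTEER)
    features = []
    for i, tok in enumerate(tokens):
        if tok not in GAZETTEER:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        features.append({
            "word": tok,
            "left_ngram": " ".join(left),
            "right_ngram": " ".join(right),
            "mention_count": counts[tok],
            "relative_position": i / len(tokens),
        })
    return features


example = (
    "Protesters gathered in Paris on Monday. "
    "Officials in Berlin criticized the Paris march."
)
for row in location_features(example):
    print(row)
```

Each resulting row could then be paired with a human-coded label (correct or incorrect event location) and passed to a supervised classifier in the second stage.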

Code & Models

Repositories

haoliuhoward/LostinSpace-PSRM