A Neural Model for User Geolocation and Lexical Dialectology

Afshin Rahimi; Trevor Cohn; Timothy Baldwin

arXiv:1704.04008·cs.CL·April 28, 2017

TL;DR

This paper introduces a neural network model for user geolocation from text that achieves state-of-the-art results and also produces useful dialect embeddings, supported by a new dialect term dataset.

Contribution

The paper presents a simple neural model that improves geolocation accuracy and provides a new dataset for dialect term detection.

Findings

01 Achieves state-of-the-art performance on Twitter geolocation datasets

02 Produces meaningful word and phrase embeddings for dialect detection

03 Releases DAREDS, a new dialect term evaluation dataset

Abstract

We propose a simple yet effective text-based user geolocation model based on a neural network with one hidden layer, which achieves state-of-the-art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. As part of our analysis of dialectal terms, we release DAREDS, a dataset for evaluating dialect term detection methods.
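To make the architecture in the abstract concrete, here is a minimal sketch of a one-hidden-layer text classifier in plain Python. It is not the authors' implementation. The abstract specifies only a single hidden layer whose activations serve as word and phrase embeddings; the bag-of-words input, the tanh activation, the softmax over discrete regions, the random untrained weights, and every function and variable name below are assumptions made for illustration.

```python
import math
import random


def init_params(vocab_size, hidden_size, num_regions, seed=0):
    """Create small random weights (untrained; for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": mat(vocab_size, hidden_size),   # input -> hidden
        "b1": [0.0] * hidden_size,
        "W2": mat(hidden_size, num_regions),  # hidden -> region scores
        "b2": [0.0] * num_regions,
    }


def hidden_layer(x, params):
    """Hidden activations for a bag-of-words vector x (tanh is an assumption)."""
    W1, b1 = params["W1"], params["b1"]
    return [
        math.tanh(sum(x[i] * W1[i][j] for i in range(len(x))) + b1[j])
        for j in range(len(b1))
    ]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_region_probs(x, params):
    """Probability distribution over (assumed) discrete regions for one user."""
    h = hidden_layer(x, params)
    W2, b2 = params["W2"], params["b2"]
    logits = [
        sum(h[j] * W2[j][k] for j in range(len(h))) + b2[k]
        for k in range(len(b2))
    ]
    return softmax(logits)


def word_embedding(word_index, vocab_size, params):
    """Hidden activations for a one-hot input, one plausible way to read off
    a word embedding from the hidden layer."""
    one_hot = [0.0] * vocab_size
    one_hot[word_index] = 1.0
    return hidden_layer(one_hot, params)


# Hypothetical toy vocabulary and a single user's word counts.
vocab = ["hella", "yall", "wicked", "pop"]
params = init_params(vocab_size=len(vocab), hidden_size=3, num_regions=5)
user_counts = [2.0, 0.0, 1.0, 0.0]
probs = predict_region_probs(user_counts, params)
embedding = word_embedding(vocab.index("hella"), len(vocab), params)
```

With trained weights, the highest-probability entry of `probs` would be taken as the predicted region, and vectors such as `embedding` could be compared across words to look for terms with similar regional usage.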
