Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks

Afshin Rahimi; Timothy Baldwin; Trevor Cohn

arXiv:1708.04358 · cs.CL · August 16, 2017

TL;DR

This paper introduces a neural network model using Gaussian mixtures to embed locations in a continuous space, improving geolocation accuracy and lexical dialectology analysis from Twitter data.

Contribution

It presents a neural network with mixture density outputs that embeds two-dimensional locations in a continuous space. The model outperforms conventional regression-based geolocation and proves effective for predicting words from location in lexical dialectology.

Findings

01 Outperforms regression-based geolocation methods

02 Provides better uncertainty estimates in location predictions

03 Effective in lexical dialectology using Twitter data

Abstract

We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset.
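
To make the idea of a mixture-of-Gaussians output concrete, here is a minimal sketch of the negative log-likelihood of a 2D location under a Gaussian mixture, the kind of objective a mixture density network typically minimizes. It is not taken from the paper or its repository: the function names, the diagonal-covariance simplification, and the example coordinates are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Turn unnormalized mixture scores into log mixture weights."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def diag_gaussian_2d_logpdf(point, mean, sigma):
    """Log density of a 2D Gaussian with diagonal covariance.

    Diagonal covariance is an assumption of this sketch; a full
    covariance would also need a correlation term.
    """
    log_p = 0.0
    for x, mu, s in zip(point, mean, sigma):
        log_p += -0.5 * ((x - mu) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
    return log_p


def mixture_nll(point, logits, means, sigmas):
    """Negative log-likelihood of one 2D point under a K-component mixture."""
    log_weights = log_softmax(logits)
    log_terms = [
        lw + diag_gaussian_2d_logpdf(point, mu, s)
        for lw, mu, s in zip(log_weights, means, sigmas)
    ]
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


# Illustrative values only: two components and one observed (lat, lon) point.
example_logits = [0.3, -0.1]
example_means = [(40.7, -74.0), (34.1, -118.2)]
example_sigmas = [(1.0, 1.0), (2.0, 2.0)]
example_loss = mixture_nll((40.0, -75.0), example_logits, example_means, example_sigmas)
```

In a network, the logits, means, and standard deviations would be produced from the input text rather than fixed by hand, and the spread of the mixture gives a rough handle on predictive uncertainty.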

Code & Models

Repositories

afshinrahimi/geomdn (Official)
