Loglinear model selection and human mobility

Adrian Dobra; Abdolreza Mohammadi

arXiv:1711.02623·stat.ME·November 8, 2017

TL;DR

This paper introduces a new algorithm for selecting graphical loglinear models tailored for hyper-sparse contingency tables, and applies it to analyze extensive geolocated Twitter data to uncover human mobility patterns.

Contribution

It develops a novel model selection algorithm for sparse contingency tables and demonstrates its application to large-scale human mobility data from Twitter.

Findings

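01

Multi-way contingency tables can represent human mobility patterns derived from geolocated Twitter data.

02

Analysis of a 214-variable contingency table summarizing 46 million latitude/longitude locations of 476,601 Twitter users in South Africa.

03

A new selection algorithm for graphical loglinear models that is suited to hyper-sparse contingency tables.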

Abstract

Methods for selecting loglinear models were among Steve Fienberg's research interests since the start of his long and fruitful career. After we dwell upon the string of papers focusing on loglinear models that can be partly attributed to Steve's contributions and influential ideas, we develop a new algorithm for selecting graphical loglinear models that is suitable for analyzing hyper-sparse contingency tables. We show how multi-way contingency tables can be used to represent patterns of human mobility. We analyze a dataset of geolocated tweets from South Africa that comprises 46 million latitude/longitude locations of 476,601 Twitter users that is summarized as a contingency table with 214 variables.

KEYWORDS: contingency tables, model selection, human mobility, graphical models, Bayesian structural learning, birth-death processes, pseudo-likelihood
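To make the idea of a mobility contingency table concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each variable is a binary indicator of whether a user was ever observed in a given region; the abstract does not state how the 214 variables are coded, and the function name `mobility_table`, the region labels, and the toy data are all hypothetical. The sketch stores only the non-zero cells, which is what makes a hyper-sparse table tractable to hold in memory.

```python
from collections import Counter

def mobility_table(visits, regions):
    """Build a sparse contingency table from (user, region) observations.

    Assumption (not from the paper): one binary variable per region,
    equal to 1 if the user was ever observed there and 0 otherwise.
    Returns a Counter mapping each observed cell (a 0/1 tuple) to the
    number of users falling in that cell. Empty cells are not stored.
    """
    index = {region: i for i, region in enumerate(regions)}

    # Collect the set of region indices seen for each user.
    seen = {}
    for user, region in visits:
        seen.setdefault(user, set()).add(index[region])

    # Each user contributes a count of one to exactly one cell.
    cells = Counter()
    for visited in seen.values():
        cell = tuple(1 if i in visited else 0 for i in range(len(regions)))
        cells[cell] += 1
    return cells

# Toy data: three users, three hypothetical regions.
visits = [("u1", "A"), ("u1", "B"), ("u2", "A"),
          ("u3", "A"), ("u3", "B"), ("u3", "A")]
table = mobility_table(visits, ["A", "B", "C"])
for cell, count in sorted(table.items()):
    print(cell, count)
```

In this toy example only 2 of the 2^3 = 8 cells are non-zero. Under the same binary-coding assumption, a table with 214 variables would have 2^214 cells, of which at most 476,601 (one per user) could be non-zero.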
