Modeling Latent Underdispersion with Discrete Order Statistics
Jimmy Lederman, Aaron Schein

TL;DR
This paper introduces a novel class of models based on discrete order statistics to better capture underdispersion in count data, providing a flexible and interpretable alternative to traditional Poisson models.
Contribution
The paper develops a new modeling framework using discrete order statistics, with a modular data augmentation scheme, to effectively model underdispersed count data and improve fit.
Findings
Order statistic models often outperform traditional Poisson models in fit.
The framework is applicable to diverse data types like flight times and RNA sequencing.
Properties of Poisson and negative binomial order statistics are characterized.
Abstract
The Poisson distribution is the default choice of likelihood for probabilistic models of count data. However, due to the equidispersion constraint of the Poisson, such models may have predictive uncertainty that is artificially inflated. While overdispersion has been extensively studied, conditional underdispersion -- where latent structure renders data more regular than Poisson -- remains underexplored, in part due to the lack of tractable modeling tools. We introduce a new class of models based on discrete order statistics, where observed counts are assumed to be an order statistic (e.g., minimum, median, maximum) of i.i.d. draws from some discrete parent, such as the Poisson or negative binomial. We develop a general data augmentation scheme that is modular with existing tools tailored to the parent distribution, enabling parameter estimation or posterior inference in a wide range of…
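To make the central idea concrete, here is a minimal sketch of the distribution of an order statistic of i.i.d. Poisson draws. It uses the standard identity that the k-th smallest of n draws is at most y exactly when at least k draws are at most y. It is not the paper's data augmentation scheme, and the function names (`poisson_cdf`, `order_stat_pmf`, `mean_and_variance`), the truncation point `y_max`, and the example parameters are assumptions made for illustration only.

```python
import math


def poisson_cdf(y, rate):
    """P(Z <= y) for Z ~ Poisson(rate); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-rate)  # P(Z = 0)
    total = term
    for z in range(1, y + 1):
        term *= rate / z  # P(Z = z) from P(Z = z - 1)
        total += term
    return min(total, 1.0)


def order_stat_cdf(y, rate, n, k):
    """P(X_(k) <= y), where X_(k) is the k-th smallest of n i.i.d. Poisson(rate) draws."""
    f = poisson_cdf(y, rate)
    # At least k of the n draws must fall at or below y.
    return sum(math.comb(n, j) * f**j * (1.0 - f) ** (n - j) for j in range(k, n + 1))


def order_stat_pmf(y, rate, n, k):
    """P(X_(k) = y), obtained by differencing the CDF."""
    return order_stat_cdf(y, rate, n, k) - order_stat_cdf(y - 1, rate, n, k)


def mean_and_variance(rate, n, k, y_max=200):
    """Mean and variance of X_(k), summing the PMF over 0..y_max (assumed to cover the mass)."""
    pmf = [order_stat_pmf(y, rate, n, k) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var


if __name__ == "__main__":
    # Median (k = 3) of n = 5 draws versus a single Poisson draw (n = k = 1).
    print("median of 5:", mean_and_variance(rate=10.0, n=5, k=3))
    print("plain Poisson:", mean_and_variance(rate=10.0, n=1, k=1))
```

With n = k = 1 the sketch reduces to the Poisson parent, whose variance equals its mean. Taking the median of several draws concentrates the distribution, so the variance falls below the mean, which is the underdispersion the model class is meant to express.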
