Visualizing Count Data Regressions Using Rootograms
Christian Kleiber, Achim Zeileis

TL;DR
This paper extends the rootogram, a graphical goodness-of-fit tool, to regression models, where it helps diagnose problems such as overdispersion and excess zeros in count data. Practical applications are demonstrated on real datasets, and an implementation is available as an R package.
Contribution
It introduces a novel extension of rootograms for regression models, including weighted versions for out-of-sample and subset analysis, enhancing diagnostic capabilities for count data models.
Findings
Rootograms for regression models help diagnose overdispersion and excess zeros in count data.
Their utility is demonstrated on real datasets from ethology, public health, and finance.
An R package provides a practical implementation.
Abstract
The rootogram is a graphical tool associated with the work of J. W. Tukey that was originally used for assessing goodness of fit of univariate distributions. Here we extend the rootogram to regression models and show that this is particularly useful for diagnosing and treating issues such as overdispersion and/or excess zeros in count data models. We also introduce a weighted version of the rootogram that can be applied out of sample or to (weighted) subsets of the data, e.g., in finite mixture models. An empirical illustration revisiting a well-known data set from ethology is included, for which a negative binomial hurdle model is employed. Supplementary materials providing two further illustrations are available online: the first, using data from public health, employs a two-component finite mixture of negative binomial models, the second, using data from finance, involves…
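The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' R implementation, and it assumes a Poisson model purely for illustration. The expected frequency of count j is taken as the (optionally weighted) sum of the fitted probabilities over all observations, and observed and expected frequencies are compared on a square-root scale. The names `poisson_pmf`, `rootogram_table`, and `bar_bottom` are hypothetical, as are the toy data and the constant fitted means, which stand in for the fitted values of a regression.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability P(Y = k) for mean mu > 0, computed on the log scale."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def rootogram_table(y, mu, weights=None, max_count=None):
    """Tabulate the quantities behind a hanging rootogram.

    y       : observed counts, one per observation
    mu      : fitted means, one per observation (assumed Poisson here)
    weights : optional case weights; all 1.0 if omitted
    """
    if weights is None:
        weights = [1.0] * len(y)
    if max_count is None:
        max_count = max(y)
    rows = []
    for j in range(max_count + 1):
        # Observed frequency: (weighted) number of observations equal to j.
        observed = sum(w for yi, w in zip(y, weights) if yi == j)
        # Expected frequency: (weighted) sum of fitted probabilities of j.
        expected = sum(w * poisson_pmf(j, m) for m, w in zip(mu, weights))
        rows.append({
            "count": j,
            "sqrt_observed": math.sqrt(observed),
            "sqrt_expected": math.sqrt(expected),
            # Bars hang from sqrt(expected); a bottom below zero means the
            # model predicts fewer observations at this count than were seen.
            "bar_bottom": math.sqrt(expected) - math.sqrt(observed),
        })
    return rows


# Hypothetical toy data; a constant mean stands in for regression fitted values.
y = [0, 0, 0, 0, 1, 2, 0, 3, 0, 1]
mu = [sum(y) / len(y)] * len(y)
rows = rootogram_table(y, mu)
for row in rows:
    print(row["count"], round(row["sqrt_observed"], 3), round(row["bar_bottom"], 3))
```

In this toy example the bar for count 0 ends below zero, which is the kind of pattern that would point to excess zeros relative to the assumed Poisson fit. Passing non-uniform `weights` gives a weighted variant of the same table, in the spirit of the weighted rootogram mentioned in the abstract.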
