A graphical method of cumulative differences between two subpopulations

Mark Tygert

arXiv:2108.02666 · stat.ME · December 20, 2021

TL;DR

This paper introduces graphical methods and accompanying scalar summary statistics for comparing outcomes between two subpopulations as a function of their scores. The methods avoid arbitrary binning, whose choice can change how the displayed differences are interpreted.

Contribution

It develops cumulative-difference methods, analogous to Kolmogorov-Smirnov tests, for comparing two subpopulations whose outcomes are discrete and whose members need not share the same scores.

Findings

01

Eliminates the need for binning in distribution comparison plots.

02

Provides scalar metrics similar to Kolmogorov-Smirnov statistics.

03

Enhances interpretation of differences between subpopulations.
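
The idea behind a cumulative-difference graph and its scalar summary can be sketched as follows. This is a minimal illustration under a loud simplifying assumption, not the paper's algorithm: it assumes the two subpopulations are observed in pairs at identical scores, whereas the paper addresses scores that differ between subpopulations. The function name `cumulative_differences` and the example data are hypothetical.

```python
import numpy as np

def cumulative_differences(scores, outcomes_a, outcomes_b):
    """Cumulative differences for PAIRED observations (a simplifying assumption).

    Entry k of outcomes_a and outcomes_b is assumed to be observed at scores[k].
    Returns the abscissae, the cumulative-difference curve, and a
    Kolmogorov-Smirnov-like scalar (maximum absolute value of the curve).
    """
    order = np.argsort(scores, kind="stable")  # process observations by increasing score
    diffs = (np.asarray(outcomes_a, dtype=float)
             - np.asarray(outcomes_b, dtype=float))[order]
    n = len(diffs)
    # Curve starts at 0; each step adds one paired difference divided by n.
    curve = np.concatenate(([0.0], np.cumsum(diffs) / n))
    abscissae = np.arange(n + 1) / n
    ks_like = np.abs(curve).max()
    return abscissae, curve, ks_like

# Hypothetical data: subpopulation A has higher outcomes at low scores,
# subpopulation B has higher outcomes at high scores.
scores = [0.1, 0.2, 0.3, 0.4]
x, curve, ks_like = cumulative_differences(scores, [1, 1, 0, 0], [0, 0, 1, 1])
print(curve, ks_like)
```

With this construction, the slope of the secant line between two points on the curve equals the average difference in outcomes over the corresponding range of scores, so no bin width has to be chosen before plotting.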

Abstract

Comparing the differences in outcomes (that is, in "dependent variables") between two subpopulations is often most informative when comparing outcomes only for individuals from the subpopulations who are similar according to "independent variables." The independent variables are generally known as "scores," as in propensity scores for matching or as in the probabilities predicted by statistical or machine-learned models, for example. If the outcomes are discrete, then some averaging is necessary to reduce the noise arising from the outcomes varying randomly over those discrete values in the observed data. The traditional method of averaging is to bin the data according to the scores and plot the average outcome in each bin against the average score in the bin. However, such binning can be rather arbitrary and yet greatly impacts the interpretation of displayed deviation between the…
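
The traditional binning approach described in the abstract can be sketched as below, to show how the number of bins changes the picture. This is an illustrative sketch only: the equal-width bins on [0, 1], the function name `binned_averages`, and the synthetic data are assumptions, not taken from the paper.

```python
import numpy as np

def binned_averages(scores, outcomes, n_bins):
    """Mean score and mean outcome in each of n_bins equal-width bins on [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    idx = np.digitize(scores, edges[1:-1])
    mean_scores, mean_outcomes = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():  # empty bins are skipped
            mean_scores.append(scores[mask].mean())
            mean_outcomes.append(outcomes[mask].mean())
    return np.array(mean_scores), np.array(mean_outcomes)

# Synthetic binary outcomes whose probability of being 1 equals the score.
rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
outcomes = (rng.uniform(size=1000) < scores).astype(float)

# The same data summarized with two different, equally defensible bin counts.
for n_bins in (4, 20):
    s, o = binned_averages(scores, outcomes, n_bins)
    print(n_bins, np.round(o - s, 3))
```

Fewer bins average away more noise but hide fine-scale structure, while more bins do the opposite; the plotted deviations therefore depend on a choice that the data alone does not dictate.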

Code & Models

Repositories

facebookresearch/fbcddisgraph
PyTorch · Official

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Neural Networks and Applications · Advanced Statistical Methods and Models