Profiting from correlations: Adjusted estimators for categorical data
Tobias Niebuhr, Mathias Trabs

TL;DR
This paper studies weighted estimators for a marginal distribution of a contingency table when another marginal is known. The weighting exploits the correlation between the two marginals to reduce variance and to correct sample bias.
Contribution
It proposes estimators for one marginal distribution under the assumption that another marginal is known, shows that they have a strictly smaller asymptotic variance whenever the two marginals are correlated, and applies them to traffic accident data.
Findings
The weighted estimators have a strictly smaller asymptotic variance whenever the two marginals are correlated.
A simulation study illustrates the finite sample performance.
Applied to traffic accident data, the method corrects a well-known bias in the observed injury severity distribution.
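The variance claim can be illustrated with a small Monte Carlo sketch. It is not the paper's simulation design: it assumes a post-stratification-style weighted estimator (conditional row frequencies within each column, averaged with the known column marginal), and the function name `simulate` and the 2x2 joint distribution are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(joint, n, reps, seed=0):
    """Compare the plain and the weighted estimator of the row marginal.

    joint : true joint probabilities (rows x columns), summing to 1.
    The column marginal is treated as known; this mirrors the setting
    described in the summary, but the estimator itself is an assumption.
    """
    rng = np.random.default_rng(seed)
    joint = np.asarray(joint, dtype=float)
    q = joint.sum(axis=0)  # known column marginal
    plain = np.empty((reps, joint.shape[0]))
    weighted = np.empty_like(plain)
    for r in range(reps):
        counts = rng.multinomial(n, joint.ravel()).reshape(joint.shape)
        # Plain estimator: observed row frequencies.
        plain[r] = counts.sum(axis=1) / n
        # Weighted estimator: row frequencies within each column,
        # recombined with the known column marginal.
        col_totals = counts.sum(axis=0)
        cond = np.divide(counts, col_totals,
                         out=np.zeros(joint.shape), where=col_totals > 0)
        weighted[r] = cond @ q
    return plain.var(axis=0), weighted.var(axis=0)

# Hypothetical joint distribution with strongly correlated marginals.
correlated = [[0.4, 0.1],
              [0.1, 0.4]]
var_plain, var_weighted = simulate(correlated, n=200, reps=2000)
print("plain   :", var_plain)
print("weighted:", var_weighted)
```

With this correlated table the empirical variance of the weighted estimator should come out clearly below that of the plain estimator. With an independent table (all cells 0.25) the two should be roughly equal, consistent with the condition that the gain requires correlated marginals.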
Abstract
To take sample biases and skewness in the observations into account, practitioners frequently weight their observations according to some marginal distribution. The present paper demonstrates that such weighting can indeed improve the estimation. Studying contingency tables, estimators for marginal distributions are proposed under the assumption that another marginal is known. It is shown that the weighted estimators have a strictly smaller asymptotic variance whenever the two marginals are correlated. The finite sample performance is illustrated in a simulation study. In an application to traffic accident data, the method allows for correcting a well-known bias in the observed injury severity distribution.
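To make the weighting idea concrete, here is a minimal sketch of how a known marginal can correct a biased sample. It assumes a post-stratification-style estimator, which may differ from the estimators defined in the paper. The function name `weighted_marginal`, the 2x2 table, and the known marginal are all hypothetical and are not taken from the traffic accident data.

```python
import numpy as np

def weighted_marginal(counts, known_col_marginal):
    """Estimate the row marginal of a contingency table by reweighting
    the columns to a known column marginal.

    Post-stratification-style sketch; an assumption for illustration,
    not necessarily the paper's exact estimator.
    """
    counts = np.asarray(counts, dtype=float)
    q = np.asarray(known_col_marginal, dtype=float)
    col_totals = counts.sum(axis=0)
    if np.any(col_totals == 0):
        raise ValueError("every column needs at least one observation")
    cond = counts / col_totals  # estimated P(row | column)
    return cond @ q             # average over the known column marginal

# Hypothetical sample: rows are the categories of interest, columns are
# an auxiliary variable whose true distribution is known to be (0.8, 0.2),
# although the sample contains both columns equally often.
counts = np.array([[40, 10],
                   [10, 40]])
known_q = np.array([0.8, 0.2])

raw = counts.sum(axis=1) / counts.sum()
adjusted = weighted_marginal(counts, known_q)
print(raw)       # [0.5 0.5]
print(adjusted)  # approximately [0.68 0.32]
```

In this toy table the sample over-represents the second column, so the raw row frequencies are pulled toward the rows that are common in that column. Reweighting to the known column marginal shifts the estimate back.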
