Statistical Distances and Their Role in Robustness
Marianthi Markatou, Yang Chen, Georgios Afendras, Bruce G. Lindsay

TL;DR
This paper explores the fundamental role of statistical distances in statistical inference and machine learning, analyzing their properties and impact on robustness and estimator behavior.
Contribution
It provides a detailed analysis of key statistical distances, highlighting their influence on robustness and estimator properties, with specific focus on Neyman's and Pearson's chi-squared statistics.
Findings
Neyman's chi-squared exhibits robust properties.
Pearson's chi-squared is less robust.
Discretization affects robustness of statistical distances.
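As a rough illustration of the first two findings, the sketch below compares the two chi-squared distances on a small discrete example. It assumes the standard textbook definitions, which this page does not state: Pearson's chi-squared is the sum of (d - m)^2 / m and Neyman's is the sum of (d - m)^2 / d, where d holds the observed cell proportions and m the model probabilities. The function names and numbers are hypothetical, chosen only to show one cell carrying far more observed mass than the model expects.

```python
def pearson_chi2(d, m):
    """Pearson's chi-squared distance: squared residuals weighted by 1/m.

    Assumes every model probability in m is strictly positive.
    """
    return sum((di - mi) ** 2 / mi for di, mi in zip(d, m))


def neyman_chi2(d, m):
    """Neyman's chi-squared distance: squared residuals weighted by 1/d.

    An empty observed cell (d_i == 0) makes the distance infinite,
    so that case is returned explicitly instead of dividing by zero.
    """
    if any(di == 0 for di in d):
        return float("inf")
    return sum((di - mi) ** 2 / di for di, mi in zip(d, m))


# Hypothetical model probabilities and observed proportions.
# The last cell is an "outlier" cell: the model gives it 1% of the mass,
# but 10% of the observations fall there.
m = [0.50, 0.30, 0.19, 0.01]
d = [0.45, 0.27, 0.18, 0.10]

pearson_value = pearson_chi2(d, m)
neyman_value = neyman_chi2(d, m)

print(f"Pearson: {pearson_value:.4f}")
print(f"Neyman:  {neyman_value:.4f}")
```

In this example the outlier cell contributes 0.0081 / 0.01 = 0.81 to Pearson's distance but only 0.0081 / 0.10 = 0.081 to Neyman's, so Pearson's total is roughly nine times larger. The flip side is visible in the empty-cell branch: Neyman's distance becomes infinite as soon as one observed cell is empty, while Pearson's stays finite.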
Abstract
Statistical distances, divergences, and similar quantities have a long history and play a fundamental role in statistics, machine learning, and associated scientific disciplines. However, within the statistical literature, this extensive role has too often been played out behind the scenes, with other aspects of the statistical problems being viewed as more central, more interesting, or more important. The behind-the-scenes role of statistical distances shows up in estimation, where we often use estimators based on minimizing a distance, explicitly or implicitly, but rarely study how the properties of a distance determine the properties of the estimators. Distances are also prominent in goodness-of-fit, but the usual question we ask is "how powerful is this method against a set of interesting alternatives", not "what aspect of the distance between the hypothetical model and the…
