
TL;DR
This paper demonstrates that many statistical methods inherently have singularities due to topological obstructions, affecting their stability and continuity, with implications for various data analysis techniques.
Contribution
It establishes lower bounds on the size of singularity sets for broad classes of data maps, revealing fundamental topological limitations of statistical procedures.
Findings
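Broad classes of statistical methods must have singularities because of topological obstructions to continuity.
The paper derives lower bounds on the Hausdorff dimension of the singularity sets of such methods.
The results are applied to plane fitting, spherical data, and linear classification.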
Abstract
Statistical data by their very nature are indeterminate in the sense that if one repeats the process of collecting the data the new data set will be different from the original. But two data sets generated in the same way should "tell the same story". Therefore, a statistical method, a map $\Phi$ taking a data set $x$ to a point in some space $F$, should be stable at $x$: Small perturbations in $x$ should result in a small change in $\Phi(x)$. Otherwise, $\Phi$ is useless at $x$ or, and this is important, near $x$. So one doesn't want $\Phi$ to have "singularities," data sets $x$ such that the limit of $\Phi(y)$ as $y$ approaches $x$ doesn't exist. (The same issue arises elsewhere in applied math.) We prove that broad classes of statistical methods have topological obstructions to continuity: They must have singularities. We derive broadly applicable lower bounds on…
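The idea of a singularity can be illustrated with a small numerical sketch. The example below is not taken from the paper. It assumes a simple stand-in for plane fitting, namely fitting a line to 2D points by taking the leading principal axis. The function name `fitted_direction`, the four-point data set, and the perturbation size `eps` are all hypothetical choices made for illustration. When the scatter of the data is perfectly isotropic, no direction is preferred, and two perturbations of size about `eps` produce fitted lines that differ by a right angle.

```python
import numpy as np

def fitted_direction(data):
    """Angle in [0, pi) of the leading principal axis of 2D data (illustrative helper)."""
    centered = data - data.mean(axis=0)
    scatter = centered.T @ centered
    # eigh returns eigenvalues in ascending order; take the last eigenvector.
    _, eigvecs = np.linalg.eigh(scatter)
    v = eigvecs[:, -1]
    # A line has no orientation, so reduce the angle modulo pi.
    return np.arctan2(v[1], v[0]) % np.pi

def line_angle_gap(a, b):
    """Distance between two unoriented directions, in [0, pi/2]."""
    d = abs(a - b) % np.pi
    return min(d, np.pi - d)

# Assumed toy data: four points whose scatter matrix is a multiple of the identity,
# so every direction fits equally well (a candidate singularity).
base = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])

eps = 1e-6
stretch_x = base * np.array([1.0 + eps, 1.0])  # tiny stretch along the x-axis
stretch_y = base * np.array([1.0, 1.0 + eps])  # tiny stretch along the y-axis

perturbation = np.abs(stretch_x - stretch_y).max()
gap = line_angle_gap(fitted_direction(stretch_x), fitted_direction(stretch_y))

print(f"data sets differ by at most {perturbation:.1e}")
print(f"fitted directions differ by {gap:.4f} radians")
```

Shrinking `eps` leaves the gap in fitted directions unchanged, so the fitted direction has no limit as the perturbed data approach `base`. This is the behavior the abstract calls a singularity.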
