Distribution Fitting 2. Pearson-Fisher, Kolmogorov-Smirnov, Anderson-Darling, Wilks-Shapiro, Cramer-von-Misses and Jarque-Bera statistics
Lorentz Jantschi, Sorana D. Bolboaca

TL;DR
This paper reviews various statistical tests for distribution fitting, applying them to chemical data sets to evaluate normality, outlier effects, and the robustness of these tests in practical scenarios.
Contribution
It provides a comparative analysis of multiple distribution fitting statistics on real chemical data, highlighting their sensitivity to outliers and robustness.
Findings
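The Kolmogorov-Smirnov statistic is less affected by outliers (positive variation below 2%).
Outliers lead to type II errors with the Kolmogorov-Smirnov statistic and type I errors with the Anderson-Darling statistic.
Normality was rejected for both data sets; after the Grubbs test identified one outlier and it was removed, normality was accepted for the set of 205 compounds.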
Abstract
The methods for measuring the departure between observation and model were reviewed. The following statistics were applied to two experimental data sets: Chi-Squared, Kolmogorov-Smirnov, Anderson-Darling, Wilks-Shapiro, and Jarque-Bera. Both investigated sets proved not to be normally distributed. The Grubbs test identified one outlier in the set of 205 chemically active compounds, and after its removal the normality of that set was accepted. The second data set proved not to have any outliers. The Kolmogorov-Smirnov statistic is less affected by the existence of outliers (positive variation, expressed as a percentage, smaller than 2). Outliers induce type II errors in the Kolmogorov-Smirnov statistic and type I errors in the Anderson-Darling statistic.
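To make the outlier comparison in the abstract concrete, the following is a minimal sketch in pure standard-library Python. It computes the Kolmogorov-Smirnov, Anderson-Darling, and Jarque-Bera statistics against a normal model whose mean and standard deviation are estimated from the sample, and reports how each statistic changes when one extreme value is appended. The data are synthetic (205 simulated normal draws plus one artificial outlier), not the paper's chemical data sets, and the helper names `ks_statistic`, `ad_statistic`, and `jarque_bera` are hypothetical names chosen for this illustration.

```python
import math
import random
from statistics import mean, stdev


def _z_scores(sample):
    """Sort the sample and standardize it with its own mean and sample SD."""
    m, s = mean(sample), stdev(sample)
    return [(x - m) / s for x in sorted(sample)]


def _cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def _sf(z):
    """Standard normal survival function, 1 - CDF."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def ks_statistic(sample):
    """Kolmogorov-Smirnov D: largest gap between empirical and model CDF."""
    zs = _z_scores(sample)
    n = len(zs)
    return max(
        max(i / n - _cdf(z), _cdf(z) - (i - 1) / n)
        for i, z in enumerate(zs, start=1)
    )


def ad_statistic(sample):
    """Anderson-Darling A^2, which weights the tails more heavily than KS."""
    zs = _z_scores(sample)
    n = len(zs)
    total = sum(
        (2 * i - 1) * (math.log(_cdf(zs[i - 1])) + math.log(_sf(zs[n - i])))
        for i in range(1, n + 1)
    )
    return -n - total / n


def jarque_bera(sample):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(sample)
    m = mean(sample)
    m2 = sum((x - m) ** 2 for x in sample) / n
    m3 = sum((x - m) ** 3 for x in sample) / n
    m4 = sum((x - m) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return jb, math.exp(-jb / 2)


# Synthetic illustration only: these are NOT the paper's measurements.
random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(205)]
with_outlier = clean + [mean(clean) + 8 * stdev(clean)]  # artificial outlier

for name, stat in [
    ("KS D", ks_statistic),
    ("AD A^2", ad_statistic),
    ("JB", lambda s: jarque_bera(s)[0]),
]:
    before, after = stat(clean), stat(with_outlier)
    change = 100 * (after - before) / before
    print(f"{name}: {before:.4f} -> {after:.4f} ({change:+.1f}%)")
```

Because the normal parameters are estimated from the same sample, standard Kolmogorov-Smirnov critical values do not strictly apply; the sketch is intended only to compare how the raw statistics move. In practice, SciPy offers tested implementations (`scipy.stats.kstest`, `scipy.stats.anderson`, `scipy.stats.shapiro`, `scipy.stats.jarque_bera`).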
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Distribution Estimation and Applications · Spectroscopy and Chemometric Analyses
