Testing Normality of Data Transformed by Maximum Likelihood Box Cox

Douglas M Hawkins

arXiv:2407.19329·stat.ME·September 23, 2024·1 cites

Open Access

TL;DR

This paper investigates the bias in normality tests after Box-Cox transformations and proposes a recalibration method to correct this bias, enhancing the reliability of parametric analyses in biomarker and environmental studies.

Contribution

It introduces a recalibration approach to correct bias in normality tests applied to Box-Cox transformed data, including the Anderson-Darling and Shapiro-Wilk tests.

Findings

01. Bias in normality tests is severe after Box-Cox transformation.

02. Recalibration effectively reduces bias in multiple normality tests.

03. Improved normality testing supports more accurate parametric analysis.

Abstract

Transforming a random variable to improve its normality leads naturally to a follow-up test of whether the transformed variable follows a normal distribution. Previous work has shown that the Anderson-Darling test for normality suffers from resubstitution bias following a Box-Cox transformation and indicates normality much too often. The work reported here extends this by adding the Shapiro-Wilk statistic and the two-parameter Box-Cox transformation; all of these combinations show severe bias. We also develop a recalibration to correct the bias in all four settings (each of the two tests paired with the one- and two-parameter transformations). The methodology was motivated by finding reference ranges in biomarker studies, where parametric analysis, possibly on a power-transformed measurand, can be much more informative than nonparametric analysis. Setting environmental standards illustrates another potential application.
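The resubstitution bias described in the abstract can be explored with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the paper's procedure: it draws lognormal samples (so a Box-Cox transformation with lambda = 0 is exactly correct), estimates lambda by maximum likelihood with `scipy.stats.boxcox`, and applies `scipy.stats.shapiro` to the transformed data. It then compares the rejection rate at the nominal 5% level with the rate obtained from a simulated critical value. The sample size, lognormal scale, number of replicates, and the quantile-based recalibration are all choices made here for illustration.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # nominal significance level


def shapiro_after_boxcox(x):
    """Fit lambda by maximum likelihood, transform, then run Shapiro-Wilk."""
    y, _lam = stats.boxcox(x)   # lmbda=None, so lambda is estimated by ML
    w, p = stats.shapiro(y)     # p is the usual, uncorrected p-value
    return w, p


def simulate_null(n, n_sim, rng, sigma=0.5):
    """Simulate W and nominal p-values when a power transform is exactly right.

    Assumption for illustration: lognormal data, so log(x) (lambda = 0) is normal.
    """
    ws = np.empty(n_sim)
    ps = np.empty(n_sim)
    for i in range(n_sim):
        x = rng.lognormal(mean=0.0, sigma=sigma, size=n)
        ws[i], ps[i] = shapiro_after_boxcox(x)
    return ws, ps


rng = np.random.default_rng(0)
ws, ps = simulate_null(n=50, n_sim=2000, rng=rng)

# Rejection rate if the standard p-value is taken at face value.
nominal_rate = float(np.mean(ps < ALPHA))

# Simple simulation-based recalibration (an assumption, not the paper's method):
# use the ALPHA quantile of the simulated W values as the critical value.
crit = float(np.quantile(ws, ALPHA))
recal_rate = float(np.mean(ws < crit))

print(f"rejection rate using nominal p-values: {nominal_rate:.3f}")
print(f"simulated critical value for W:        {crit:.4f}")
print(f"rejection rate using that value:       {recal_rate:.3f}")
```

If the transformed-data test is biased in the direction the abstract reports, the first printed rate should fall below 0.05, while the second is 0.05 by construction. A critical value obtained this way applies only to the simulated sample size and setting; a practical recalibration would need to cover a range of sample sizes and both transformations.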

Taxonomy

Topics: Statistical Methods in Clinical Trials · Statistical Methods and Bayesian Inference · Statistical Distribution Estimation and Applications