Data-driven Smooth Tests for Normality in ANOVA When the Number of Groups is Large
Peiwen Jia, Xiaojun Song, Haoyu Wei

TL;DR
This paper develops Neyman's smooth tests for normality in ANOVA models with many groups. The tests are built from ANOVA residuals, use a data-driven rule to choose the test order, and are supported by simulations and real data.
Contribution
It develops a fully data-driven normality test for ANOVA models in which the number of groups diverges, accounting for the parameter estimation effects induced by using residuals and establishing the asymptotic properties of the test.
Findings
The test statistics follow an asymptotic Chi-square distribution under the null hypothesis of normality.
A modified Schwarz's rule selects the test order automatically from the data.
Simulations and real data indicate that the test performs well in practice.
Abstract
The normality assumption for random errors is fundamental in the analysis of variance (ANOVA) models. However, it is rarely subjected to formal testing in practice, and theoretically justified procedures are largely unavailable, especially when the number of groups diverges. In this paper, we develop Neyman's smooth tests for assessing normality in a broad class of ANOVA models, allowing the number of groups to diverge. The proposed test statistics are constructed via the Gaussian probability integral transformation of ANOVA residuals. We show that using residuals induces non-negligible parameter estimation effects, whose structure depends on the underlying ANOVA model and plays a crucial role in shaping the form of the test statistics and their asymptotic behavior. Under the null hypothesis of normality, the resulting statistics follow an asymptotic Chi-square distribution, with…
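The construction described in the abstract can be illustrated with a rough sketch for the one-way layout. This is not the authors' statistic: it is the textbook Neyman smooth statistic applied naively to transformed residuals, with a plain Schwarz-type penalty rather than the paper's modified rule, and it deliberately ignores the parameter estimation effects that the paper corrects for. The function names (`anova_residual_pit`, `legendre_scores`, `naive_smooth_test`) and the default `max_order=5` are hypothetical choices made for illustration.

```python
import math
import random


def anova_residual_pit(groups):
    """Map one-way ANOVA residuals to (0, 1) via the Gaussian CDF."""
    resid = []
    for g in groups:
        mean = sum(g) / len(g)
        resid.extend(y - mean for y in g)
    n, a = len(resid), len(groups)
    # Pooled variance estimate with n - a degrees of freedom.
    sigma = math.sqrt(sum(e * e for e in resid) / (n - a))
    return [0.5 * (1.0 + math.erf(e / (sigma * math.sqrt(2.0)))) for e in resid]


def legendre_scores(u, max_order):
    """Orthonormal shifted Legendre polynomials phi_1..phi_max_order at u."""
    x = 2.0 * u - 1.0
    p_prev, p_curr = 1.0, x  # P_0 and P_1 on [-1, 1]
    out = []
    for j in range(1, max_order + 1):
        out.append(math.sqrt(2 * j + 1) * p_curr)
        # Bonnet recurrence: (j+1) P_{j+1} = (2j+1) x P_j - j P_{j-1}
        p_prev, p_curr = p_curr, ((2 * j + 1) * x * p_curr - j * p_prev) / (j + 1)
    return out


def naive_smooth_test(groups, max_order=5):
    """Return (selected order, cumulative statistics N_1..N_max_order).

    Assumption: no correction for estimating group means and variance,
    so these values are NOT calibrated against Chi-square quantiles.
    """
    u = anova_residual_pit(groups)
    n = len(u)
    sums = [0.0] * max_order
    for ui in u:
        for j, s in enumerate(legendre_scores(ui, max_order)):
            sums[j] += s
    # N_k = sum_{j<=k} (n^{-1/2} sum_i phi_j(U_i))^2
    stats, total = [], 0.0
    for s in sums:
        total += s * s / n
        stats.append(total)
    # Schwarz-type selection: maximise N_k - k log n over k.
    order = max(range(1, max_order + 1),
                key=lambda k: stats[k - 1] - k * math.log(n))
    return order, stats


if __name__ == "__main__":
    random.seed(0)
    # Many small groups, mimicking the "large number of groups" regime.
    data = [[random.gauss(mu, 1.0) for _ in range(5)]
            for mu in (random.uniform(-2, 2) for _ in range(200))]
    order, stats = naive_smooth_test(data)
    print("selected order:", order)
    print("statistic at selected order:", stats[order - 1])
```

Because each group contributes an estimated mean, the uncorrected values returned here should not be compared with Chi-square critical values when the number of groups is large. That mismatch is the estimation effect the paper's statistics are designed to handle.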
