A test statistic, $h^*$, for outlier analysis
Johan F. Hoorn, Johnny K. W. Ho

TL;DR
This paper introduces the h* statistic, a new parametric method for outlier detection that does not assume normality, providing a more meaningful assessment of outliers in complex datasets.
Contribution
The paper presents the h* statistic, a novel outlier evaluation method that assesses extremity relative to data groups, extending traditional techniques without normality assumptions.
Findings
h* effectively distinguishes meaningful outliers from mere extremities
Empirical validation using mood intervention data demonstrates h*'s robustness
Extensions include Bayesian inference and weighted, nuanced analysis
Abstract
Outlier analysis is a critical tool across diverse domains, from clinical decision-making to cybersecurity and talent identification. Traditional statistical outlier detection methods, such as Grubbs' test and Dixon's Q, are predicated on the assumption of normality and often fail to account for the meaningfulness of exceptional values within non-normal datasets. In this paper, we introduce the h* statistic, a novel parametric, frequentist approach for evaluating global outliers without the normality assumption. Unlike conventional techniques that primarily remove outliers to preserve statistical 'integrity,' h* treats outliers as phenomena worthy of investigation: it quantifies a data point's extremity relative to its group as a measure of statistical significance, analogous to the role of Student's t in comparing means. We detail the mathematical formulation of h* with…
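For context, the two traditional methods the abstract names can be sketched in a few lines. This is a minimal illustration of the standard Grubbs' G and Dixon's Q test statistics only. It is not the h* formula, which this summary does not state. The function names and the sample data are illustrative assumptions, and the critical values that both tests compare against (which are derived under normality) are omitted.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Grubbs' G: largest absolute deviation from the mean, in sample SDs."""
    if len(values) < 3:
        raise ValueError("need at least 3 observations")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return max(abs(v - m) for v in values) / s


def dixon_q(values):
    """Dixon's Q: gap between the most extreme value and its neighbour, over the range."""
    if len(values) < 3:
        raise ValueError("need at least 3 observations")
    xs = sorted(values)
    data_range = xs[-1] - xs[0]
    gap_low = xs[1] - xs[0]      # gap if the smallest value is the suspect
    gap_high = xs[-1] - xs[-2]   # gap if the largest value is the suspect
    return max(gap_low, gap_high) / data_range


# Made-up data for illustration; 20.0 is the suspect value.
sample = [2.0, 3.0, 4.0, 5.0, 6.0, 20.0]
g = grubbs_statistic(sample)
q = dixon_q(sample)
print(f"Grubbs' G = {g:.3f}, Dixon's Q = {q:.3f}")
```

Both statistics only measure how far the suspect value sits from the rest of the sample. Deciding whether that distance is significant relies on tabulated critical values that assume normally distributed data, which is the limitation the abstract says h* is designed to avoid.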
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Statistical Mechanics and Entropy
