Robust normality transformation for outlier detection in diverse distributions, with application to functional neuroimaging data

Saranjeet Singh Saluja; Fatma Parlak; Amanda Mejia

arXiv:2505.11806·stat.ME·November 19, 2025


TL;DR

This paper introduces a robust, flexible transformation method based on the SHASH family to improve outlier detection across diverse distributions, especially in neuroimaging data, outperforming existing techniques.

Contribution

A novel robust transformation approach using SHASH distributions that handles skew and tail variations, enhancing outlier detection in complex data.

Findings

01. SHASH transformation outperforms existing methods in simulations

02. High sensitivity to outliers even with 20-30% contamination

03. Effective noise reduction in neuroimaging data

Abstract

Automatic detection of statistical outliers is facilitated through knowledge of the source distribution of regular observations. Since the population distribution is often unknown in practice, one approach is to apply a transformation to Normality. However, the efficacy of transformation is hindered by the presence of outliers, which can have an outsized influence on transformation parameter(s) and lead to masking of outliers post-transformation. Robust Box-Cox and Yeo-Johnson transformations have been proposed, but they are only equipped to deal with skew. Here, we develop a novel robust method for transformation to Normality based on the highly flexible sinh-arcsinh (SHASH) family of distributions, which can accommodate skew, non-Gaussian tail weights, and combinations of both. A critical step is initializing outliers, given their potential influence on the highly…
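To illustrate the idea of a sinh-arcsinh transformation to Normality, here is a minimal Python sketch. It assumes the standard four-parameter SHASH form of Jones and Pewsey, in which sinh(delta * arcsinh((x - mu) / sigma) - epsilon) is standard Normal when x follows a SHASH distribution, and it assumes the parameters are already known. It does not reproduce the paper's robust fitting or outlier initialization procedure; the function names, parameter names, and the threshold of 4 are hypothetical choices for illustration only.

```python
import numpy as np


def shash_to_normal(x, mu=0.0, sigma=1.0, epsilon=0.0, delta=1.0):
    """Map SHASH-distributed values to standard-Normal scores.

    Assumed parameterization: epsilon controls skew, delta controls tail
    weight (delta < 1 gives heavier-than-Normal tails).
    """
    x = np.asarray(x, dtype=float)
    return np.sinh(delta * np.arcsinh((x - mu) / sigma) - epsilon)


def normal_to_shash(z, mu=0.0, sigma=1.0, epsilon=0.0, delta=1.0):
    """Inverse map: turn standard-Normal scores into SHASH values."""
    z = np.asarray(z, dtype=float)
    return mu + sigma * np.sinh((np.arcsinh(z) + epsilon) / delta)


def flag_outliers(x, threshold=4.0, **params):
    """Flag points whose transformed score exceeds an illustrative cutoff."""
    z = shash_to_normal(x, **params)
    return np.abs(z) > threshold


if __name__ == "__main__":
    params = dict(mu=0.0, sigma=1.0, epsilon=0.5, delta=0.7)
    # Regular observations: Normal scores pushed through the inverse map,
    # which yields skewed, heavy-tailed data.
    regular = normal_to_shash(np.linspace(-3.0, 3.0, 13), **params)
    # Append one artificial extreme value and flag on the Normal scale.
    data = np.append(regular, 200.0)
    print(flag_outliers(data, **params))
```

In practice the parameters are unknown and must be estimated from contaminated data, which is the step where a robust procedure such as the one the abstract describes matters; with naively fitted parameters, outliers can distort the fit and end up masked after transformation.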
