Statistically Significant Linear Regression Coefficients Solely Driven By Outliers In Finite-sample Inference

Felix Reichel

arXiv:2505.10738·stat.ME·May 21, 2025

Open Access

TL;DR

This paper shows that a single outlier can make a linear regression coefficient appear statistically significant when it otherwise would not be, underscoring the importance of diagnostics and robust methods for reliable inference.

Contribution

It demonstrates how a single outlier can distort significance testing in linear regression and compares traditional and robust methods to mitigate this effect.
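The effect described above can be illustrated with a minimal sketch. This is not the paper's R simulation: the data, the outlier at (40, 15), and the helper name `slope_t_stat` are hypothetical and chosen only so that the arithmetic is easy to follow. The critical values are the usual two-sided 5% values from a t-table.

```python
import math

def slope_t_stat(x, y):
    """Return (slope, standard error, t statistic) for simple OLS with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the usual slope standard error (n - 2 degrees of freedom).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se

# Hypothetical data: y alternates between -1 and +1, so it has no real trend in x.
x_clean = list(range(1, 21))
y_clean = [(-1) ** i for i in x_clean]

# Same data plus one made-up outlier far from the rest in both x and y.
x_out = x_clean + [40]
y_out = y_clean + [15]

# Two-sided 5% critical values of the t distribution (from standard tables).
T_CRIT_DF18 = 2.101  # n = 20, df = 18
T_CRIT_DF19 = 2.093  # n = 21, df = 19

_, _, t_clean = slope_t_stat(x_clean, y_clean)
_, _, t_out = slope_t_stat(x_out, y_out)

print(f"without outlier: t = {t_clean:.3f}, significant: {abs(t_clean) > T_CRIT_DF18}")
print(f"with outlier:    t = {t_out:.3f}, significant: {abs(t_out) > T_CRIT_DF19}")
```

In this toy setting the slope is far from significant on the 20 clean points and clearly significant once the single extra point is added, even though nothing about the other observations has changed.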

Findings

01. Outliers can falsely suggest significance in regression coefficients.

02. Robust Huber regression reduces outlier influence.

03. Diagnostic tools help identify influential outliers.
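As a rough illustration of the diagnostics mentioned in the findings, the sketch below computes leverage, internally studentized residuals, and Cook's distance for a simple linear regression. The data set (20 trendless points plus one made-up outlier at (40, 15)), the function name `ols_diagnostics`, and the cutoffs (leverage above 2p/n, |studentized residual| above 2, Cook's distance above 1) are assumptions: the cutoffs are common rules of thumb, not values taken from the paper.

```python
import math

def ols_diagnostics(x, y):
    """Leverage, internally studentized residuals, and Cook's distance for simple OLS."""
    n, p = len(x), 2  # p = number of fitted parameters (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - p))  # residual standard error
    # Diagonal of the hat matrix for simple regression.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    studentized = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]
    cooks = [r ** 2 * h / (1 - h) / p for r, h in zip(studentized, leverage)]
    return leverage, studentized, cooks

# Hypothetical data: 20 trendless points plus one made-up outlier as the last observation.
x = list(range(1, 21)) + [40]
y = [(-1) ** i for i in range(1, 21)] + [15]

leverage, studentized, cooks = ols_diagnostics(x, y)
n, p = len(x), 2

# Flag observations using rule-of-thumb cutoffs (assumed, not from the paper).
for i in range(n):
    flags = []
    if leverage[i] > 2 * p / n:
        flags.append("high leverage")
    if abs(studentized[i]) > 2:
        flags.append("large studentized residual")
    if cooks[i] > 1:
        flags.append("large Cook's distance")
    if flags:
        print(f"obs {i}: x={x[i]}, y={y[i]} -> {', '.join(flags)}")

worst = max(range(n), key=lambda i: cooks[i])
```

Here `worst` is the index of the observation with the largest Cook's distance, which in this constructed example is the added outlier.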

Abstract

In this paper, we investigate the impact of outliers on the statistical significance of coefficients in linear regression. We demonstrate, through numerical simulation using R, that a single outlier can cause an otherwise insignificant coefficient to appear statistically significant. We compare this with robust Huber regression, which reduces the effects of outliers. Afterwards, we approximate the influence of a single outlier on estimated regression coefficients and discuss common diagnostic statistics to detect influential observations in regression (e.g., studentized residuals). Furthermore, we relate this issue to the optional normality assumption in simple linear regression [14], required for exact finite-sample inference but asymptotically justified for large n by the Central Limit Theorem (CLT). We also address the general dangers of relying solely on p-values without performing…

Taxonomy

Topics: Advanced Statistical Methods and Models · Anomaly Detection Techniques and Applications · Statistical Mechanics and Entropy