Detection of Multiple Influential Observations on Model Selection

Dongliang Zhang; Masoud Asgharian; Martin A. Lindquist

arXiv:2412.02945·stat.ME·March 17, 2026

Open Access

TL;DR

This paper develops a new statistical framework for detecting influential outliers in high-dimensional models, including logistic regression, with applications to fMRI data, improving model robustness and reproducibility.

Contribution

It introduces a theoretically grounded approach for identifying influential observations affecting model selection in high-dimensional settings, extending existing diagnostics.

Findings

01. New asymptotic distribution derived for the diagnostic measure

02. Effective detection of influential outliers in linear and logistic models

03. Application to fMRI data reveals previously undetected influential observations

Abstract

Outlying observations are frequently encountered across a wide spectrum of scientific domains, posing notable challenges to the generalizability of statistical models and the reproducibility of downstream analysis. They are identified through influential diagnostics, which aim to capture observations that unduly bias model estimation. To date, methods for identifying observations that influence the selection of a stochastically chosen submodel have been underdeveloped, especially in the high-dimensional setting where the number of predictors $p$ exceeds the sample size $n$. Recently we proposed an improved diagnostic measure to handle this setting. However, its distributional properties and approximations have not yet been explored. To address this shortcoming, we revisit the notion of exchangeability to determine the exact asymptotic distribution of our assessment measure. This…

Taxonomy

Topics: Fault Detection and Control Systems