Perturbation and scaled Cook's distance
Hongtu Zhu, Joseph G. Ibrahim, Hyunsoon Cho

TL;DR
This paper introduces scaled Cook's distances to fairly compare influence measures across subsets of data with varying sizes in complex models, addressing a key gap in influence diagnostics.
Contribution
It proposes a new measure for the degree of perturbation and develops scaled Cook's distances for better influence analysis in complex data structures.
Findings
Scaled Cook's distances enable fair comparison of influence across subsets of different sizes.
Theoretical properties support the use of scaled measures.
Numerical examples demonstrate broad applicability.
Abstract
Cook's distance [Technometrics 19 (1977) 15-18] is one of the most important diagnostic tools for detecting influential individual observations or subsets of observations in linear regression for cross-sectional data. However, for many complex data structures (e.g., longitudinal data), no rigorous approach has been developed to address a fundamental issue: deleting subsets with different numbers of observations introduces different degrees of perturbation to the current model fitted to the data, and the magnitude of Cook's distance is associated with the degree of the perturbation. The aim of this paper is to address this issue in general parametric models with complex data structures. We propose a new quantity for measuring the degree of the perturbation introduced by deleting a subset. We use stochastic ordering to quantify the stochastic relationship between the degree of the perturbation and the…
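The size effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's scaled measure; it computes only the classical Cook's distance for subset deletion in ordinary least squares, D_I = (b - b_(I))' X'X (b - b_(I)) / (p s^2), on simulated data with no planted outliers. The function name `cooks_distance_subset`, the sample size, the coefficients, and the subset sizes are all assumptions chosen for illustration.

```python
import numpy as np

def cooks_distance_subset(X, y, idx):
    """Classical Cook's distance for deleting the rows in idx from an OLS fit."""
    n, p = X.shape
    # Fit on the full data and estimate the error variance.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_full
    sigma2 = resid @ resid / (n - p)
    # Refit with the subset removed.
    keep = np.ones(n, dtype=bool)
    keep[idx] = False
    beta_del, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    diff = beta_full - beta_del
    return float(diff @ (X.T @ X) @ diff / (p * sigma2))

# Simulated data (illustrative assumption): intercept plus two covariates.
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Average Cook's distance over random subsets of each size.
mean_distance = {}
for m in (1, 5, 20):
    draws = [
        cooks_distance_subset(X, y, rng.choice(n, size=m, replace=False))
        for _ in range(200)
    ]
    mean_distance[m] = float(np.mean(draws))
    print(f"subset size {m:2d}: mean Cook's distance = {mean_distance[m]:.4f}")
```

Even though no observation here is unusual, the average distance grows with the number of deleted rows, so raw values for subsets of different sizes are not directly comparable. This is the gap that the scaled Cook's distances are intended to address.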
