Perturbation and scaled Cook's distance

Hongtu Zhu; Joseph G. Ibrahim; Hyunsoon Cho

arXiv:1204.0064·stat.ME·June 8, 2012

TL;DR

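This paper introduces scaled Cook's distances, which allow influence to be compared fairly across deleted subsets of different sizes in complex models. They address the fact that deleting more observations perturbs the fitted model more, and that the magnitude of Cook's distance is tied to that degree of perturbation.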

Contribution

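The paper proposes a new quantity that measures the degree of perturbation introduced by deleting a subset, and develops scaled Cook's distances for influence analysis in general parametric models with complex data structures (e.g., longitudinal data).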

Findings

01

Scaled Cook's distances enable fair comparison across subsets.

02

Theoretical properties support the use of scaled measures.

03

Numerical examples demonstrate broad applicability.

Abstract

Cook's distance [Technometrics 19 (1977) 15-18] is one of the most important diagnostic tools for detecting influential individual observations or subsets of observations in linear regression for cross-sectional data. However, for many complex data structures (e.g., longitudinal data), no rigorous approach has been developed to address a fundamental issue: deleting subsets with different numbers of observations introduces different degrees of perturbation to the current model fitted to the data, and the magnitude of Cook's distance is associated with the degree of the perturbation. The aim of this paper is to address this issue in general parametric models with complex data structures. We propose a new quantity for measuring the degree of the perturbation introduced by deleting a subset. We use stochastic ordering to quantify the stochastic relationship between the degree of the perturbation and the…
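To make the issue raised in the abstract concrete, the following is a minimal sketch of the classical (unscaled) Cook's distance for subset deletion in ordinary least squares, D_I = (β̂ − β̂_[I])ᵀ XᵀX (β̂ − β̂_[I]) / (p s²). It is not the paper's scaled measure, whose definition is not given in this excerpt. The function name `subset_cooks_distance`, the simulated data, and the subset sizes are all assumptions chosen for illustration.

```python
import numpy as np


def subset_cooks_distance(X, y, subset):
    """Classical Cook's distance for deleting the rows in `subset` from an OLS fit.

    Illustrative helper (hypothetical name); this is the textbook unscaled
    distance, not the scaled version proposed in the paper.
    """
    n, p = X.shape
    # Fit on the full data and estimate the error variance s^2.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_full
    sigma2 = resid @ resid / (n - p)
    # Refit with the subset removed.
    keep = np.ones(n, dtype=bool)
    keep[subset] = False
    beta_del, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    # Quadratic form in the change of the coefficient estimate.
    diff = beta_full - beta_del
    return float(diff @ (X.T @ X) @ diff / (p * sigma2))


# Simulated data with no planted outliers (assumed setup for illustration).
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Average distance over random subsets of each size.
mean_distance = {}
for size in (1, 5, 20):
    draws = [
        subset_cooks_distance(X, y, rng.choice(n, size=size, replace=False))
        for _ in range(200)
    ]
    mean_distance[size] = float(np.mean(draws))

print(mean_distance)
```

Even though no observation here is unusual, the average distance tends to grow with the size of the deleted subset, simply because removing more rows perturbs the fit more. That is why raw distances for subsets of different sizes are not directly comparable, which is the gap the scaled Cook's distances are meant to address.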
