Computing Robust Leverage Diagnostics when the Design Matrix Contains Coded Categorical Variables
Kjell Konis

TL;DR
This paper proposes a new robust leverage diagnostic method for linear regression models with categorical variables, addressing issues with sparse design matrices that hinder traditional robust estimation techniques.
Contribution
It introduces a hybrid approach combining robust analysis of continuous predictors with classical leverage measures, suitable for sparse design matrices with categorical variables.
Findings
The method effectively identifies leverage points in models with categorical predictors.
It overcomes computational issues caused by singular matrices in robust estimation.
The approach is applicable to real-world datasets with mixed predictor types.
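The singularity problem mentioned in the findings can be illustrated with a small simulation. The sketch below is not taken from the paper: the helper `singular_subsample_fraction`, the simulated design, and the choice of a 6-level factor are all assumptions made for illustration. It draws random row subsets of size p from a design matrix and counts how often the subset is rank deficient, once for a design with dummy-coded columns and once for a purely continuous design.

```python
import numpy as np

def singular_subsample_fraction(X, subset_size, n_trials=500, seed=0):
    """Fraction of random row subsets of X that do not have full column rank.

    Hypothetical helper for illustration; it mimics the elemental subsets
    drawn by subsampling-based robust estimators.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    singular = 0
    for _ in range(n_trials):
        rows = rng.choice(n, size=subset_size, replace=False)
        if np.linalg.matrix_rank(X[rows]) < p:
            singular += 1
    return singular / n_trials

# Assumed design: intercept, one continuous predictor, and a 6-level
# factor coded as 5 dummy columns (p = 7 columns in total).
rng = np.random.default_rng(1)
n = 60
x_cont = rng.normal(size=n)
levels = rng.integers(0, 6, size=n)
dummies = (levels[:, None] == np.arange(1, 6)).astype(float)
X_mixed = np.column_stack([np.ones(n), x_cont, dummies])

# Comparison design with the same shape but only continuous columns.
X_cont = np.column_stack([np.ones(n), rng.normal(size=(n, 6))])

frac_mixed = singular_subsample_fraction(X_mixed, subset_size=X_mixed.shape[1])
frac_cont = singular_subsample_fraction(X_cont, subset_size=X_cont.shape[1])
print(f"singular subsets, dummy-coded design: {frac_mixed:.2f}")
print(f"singular subsets, continuous design:  {frac_cont:.2f}")
```

A subset of the dummy-coded design has full rank only if every factor level appears in it, which is why most small subsets fail here while subsets of the continuous design almost never do.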
Abstract
For a robust leverage diagnostic in linear regression, Rousseeuw and van Zomeren [1990] proposed using robust distance (Mahalanobis distance computed using robust estimates of location and covariance). However, a design matrix X that contains coded categorical predictor variables is often sufficiently sparse that robust estimates of location and covariance cannot be computed. Specifically, matrices formed by taking subsets of the rows of X are likely to be singular, causing algorithms that rely on subsampling to fail. Following the spirit of Maronna and Yohai [2000], we observe that extreme leverage points are extreme in the continuous predictor variables. We therefore propose a robust leverage diagnostic that combines a robust analysis of the continuous predictor variables and the classical definition of leverage.
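The two ingredients named in the abstract, classical leverage and a robust distance on the continuous predictors, can be computed side by side as in the minimal sketch below. This is an illustration under stated assumptions, not the paper's procedure: the abstract does not say how the two quantities are combined, so no combination rule is shown. The function names are hypothetical, and the coordinatewise median with a diagonal MAD-based scatter is a crude stand-in for a proper robust location and covariance estimate.

```python
import numpy as np

def classical_leverage(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X', via a reduced QR."""
    Q, _ = np.linalg.qr(X)  # Q has orthonormal columns spanning col(X)
    return np.sum(Q**2, axis=1)

def mahalanobis_distance(Z, location, scatter):
    """Mahalanobis distance of each row of Z from `location` under `scatter`."""
    diff = Z - location
    solved = np.linalg.solve(scatter, diff.T).T
    return np.sqrt(np.sum(diff * solved, axis=1))

def median_mad_estimates(Z):
    """Assumed stand-in estimator: coordinatewise median and a diagonal
    scatter of squared, consistency-scaled MADs. It ignores correlation."""
    med = np.median(Z, axis=0)
    mad = 1.4826 * np.median(np.abs(Z - med), axis=0)
    return med, np.diag(mad**2)

# Simulated data (assumption): two continuous predictors and a 3-level factor.
rng = np.random.default_rng(0)
n = 40
Z = rng.normal(size=(n, 2))
Z[0] = [8.0, -8.0]  # planted point, extreme in the continuous predictors
levels = rng.integers(0, 3, size=n)
dummies = (levels[:, None] == np.arange(1, 3)).astype(float)
X = np.column_stack([np.ones(n), Z, dummies])

h = classical_leverage(X)                 # uses the full design matrix
loc, scat = median_mad_estimates(Z)       # uses continuous columns only
rd = mahalanobis_distance(Z, loc, scat)   # robust distance stand-in

print("largest robust distance at row", int(np.argmax(rd)))
print("sum of leverages (equals rank of X):", round(float(h.sum()), 6))
```

Restricting the robust estimate to the continuous columns sidesteps the singular-subset problem, because the dummy columns never enter the robust location and scatter computation.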
Taxonomy
Topics: Advanced Statistical Methods and Models · Optimal Experimental Design Methods · Advanced Statistical Process Monitoring
