Robust regression estimation and inference in the presence of cellwise and casewise contamination
Andy Leung, Hongyang Zhang, Ruben H. Zamar

TL;DR
This paper introduces three-step regression, a robust method that filters out extreme cellwise outliers and down-weights casewise outliers, supporting estimation and inference for contaminated data sets of relatively large dimension.
Contribution
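It proposes a three-step estimator that combines a consistent univariate filter with a robust estimator of multivariate location and scatter, from which the regression coefficients are computed, targeting data in which cellwise and casewise outliers occur together.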
Findings
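The estimator is consistent and asymptotically normal at the central model, under some assumptions on the tail distributions of the continuous covariates.
It detects and eliminates extreme cellwise outliers with the univariate filter and down-weights casewise outliers in the multivariate estimation step.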
Simulation results demonstrate high resilience to various contamination types.
Abstract
Cellwise outliers are likely to occur together with casewise outliers in modern data sets with relatively large dimension. Recent work has shown that traditional robust regression methods may fail for data sets in this paradigm. The proposed method, called three-step regression, proceeds as follows: first, it uses a consistent univariate filter to detect and eliminate extreme cellwise outliers; second, it applies a robust estimator of multivariate location and scatter to the filtered data to down-weight casewise outliers; third, it computes robust regression coefficients from the estimates obtained in the second step. The three-step estimator is shown to be consistent and asymptotically normal at the central model under some assumptions on the tail distributions of the continuous covariates. The estimator is extended to handle both numerical and dummy covariates using an iterative…
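The three steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the fixed-cutoff median/MAD rule is a simple stand-in for the paper's consistent univariate filter, and median imputation followed by hard-rejection reweighting is a crude stand-in for a robust location and scatter estimator designed for filtered (incomplete) data. Only the third step, reading the coefficients off the estimated location and scatter, follows directly from the description above.

```python
import numpy as np

def univariate_filter(Z, cutoff=3.5):
    """Step 1 (stand-in): mark cells with a large robust z-score as missing."""
    med = np.median(Z, axis=0)
    mad = 1.4826 * np.median(np.abs(Z - med), axis=0)
    flagged = np.abs(Z - med) > cutoff * mad
    Zf = Z.astype(float)          # astype returns a copy
    Zf[flagged] = np.nan          # filtered cells are treated as missing
    return Zf

def robust_location_scatter(Zf, c=3.5, n_iter=20):
    """Step 2 (stand-in): median-impute, then iterate hard-rejection reweighting."""
    med = np.nanmedian(Zf, axis=0)
    Zi = np.where(np.isnan(Zf), med, Zf)
    # Robust start: coordinatewise median and squared MAD on the diagonal.
    mu = med
    mad = 1.4826 * np.median(np.abs(Zi - med), axis=0)
    S = np.diag(mad ** 2)
    for _ in range(n_iter):
        R = Zi - mu
        d2 = np.einsum("ij,jk,ik->i", R, np.linalg.inv(S), R)  # squared Mahalanobis distances
        keep = d2 <= c * np.median(d2)   # assumed cutoff: drop rows far from the bulk
        mu = Zi[keep].mean(axis=0)
        S = np.cov(Zi[keep], rowvar=False)
    return mu, S

def three_step_regression(X, y):
    """Step 3: regression coefficients from the location and scatter estimates."""
    Z = np.column_stack([X, y])
    mu, S = robust_location_scatter(univariate_filter(Z))
    p = X.shape[1]
    beta = np.linalg.solve(S[:p, :p], S[:p, p])   # S_xx^{-1} S_xy
    intercept = mu[p] - mu[:p] @ beta
    return intercept, beta
```

Because the slope is a ratio of scatter blocks, any constant factor lost by trimming rows cancels in the third step. The median imputation, however, is only reasonable when few cells are filtered; a faithful implementation would replace the second step with an estimator that handles the missing cells directly.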
