Nonsingular subsampling for S-estimators with categorical predictors
Manuel Koller

TL;DR
This paper introduces a nonsingular subsampling method for S-estimators in linear regression. It improves efficiency, especially with categorical predictors, by avoiding singular subsamples when computing initial coefficient estimates.
Contribution
The paper proposes a novel nonsingular subsampling algorithm that enhances the speed and reliability of robust linear regression with categorical predictors.
Findings
Faster than simple random subsampling when the data include categorical predictors.
Comparable in speed to simple random subsampling when all predictors are continuous.
Reduces the computational problems caused by singular subsamples.
Abstract
An integral part of many algorithms for S-estimators of linear regression is random subsampling. For problems with only continuous predictors, simple random subsampling is a reliable method to generate initial coefficient estimates that can then be further refined. For data with categorical predictors, however, random subsampling often does not work, thus limiting the use of an otherwise fine estimator. This also makes the choice of estimator for robust linear regression dependent on the type of predictors, which is an unnecessary nuisance in practice. For data with categorical predictors, random subsampling often generates singular subsamples. Since these subsamples cannot be used to calculate coefficient estimates, they have to be discarded. This makes random subsampling slow, especially if some levels of categorical predictors have low frequency, and renders the algorithms infeasible…
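To make the low-frequency problem concrete, the following is a minimal sketch of how often simple random subsampling misses a rare factor level entirely, which leaves that level's dummy column all zero and the subsample singular. It is not the paper's algorithm. The function name `prob_level_missing` and the numbers (100 observations, 5 in the rare level, 3 coefficients) are assumptions chosen for illustration.

```python
from math import comb


def prob_level_missing(n: int, k: int, p: int) -> float:
    """Probability that a simple random subsample of p out of n observations
    (drawn without replacement) contains none of the k observations that
    belong to a given factor level."""
    return comb(n - k, p) / comb(n, p)


# Hypothetical data: 100 observations, 5 of them in a rare level, and
# p = 3 coefficients (intercept, dummy for the rare level, one continuous
# predictor). A subsample with no observation from the rare level has an
# all-zero dummy column and is therefore singular.
miss = prob_level_missing(n=100, k=5, p=3)

# Other configurations can also be singular (for example, all three rows
# from the rare level), so `miss` is only a lower bound on the singular
# fraction, and this is a lower bound on the expected number of draws
# needed to obtain one usable subsample.
min_expected_draws = 1 / (1 - miss)

print(f"P(rare level missing) = {miss:.3f}")
print(f"Expected draws per usable subsample >= {min_expected_draws:.2f}")
```

Under these assumed numbers, most random subsamples are discarded, and the waste grows as the rare level gets rarer or as more low-frequency levels are added.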
Taxonomy
Topics: Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring · Bayesian Methods and Mixture Models
