Linear Regression with Sparsely Permuted Data
Martin Slawski, Emanuel Ben-David

TL;DR
This paper addresses the challenge of linear regression with data where only a small fraction of response-predictor pairs are mismatched, proposing a robust regression approach to estimate parameters and recover permutations efficiently.
Contribution
It introduces a robust regression method for sparsely permuted data, enabling consistent parameter estimation and permutation recovery with computational simplicity.
Findings
Robust regression effectively estimates parameters despite sparse mismatches.
The method can recover the permutation structure accurately.
The proposed approach is computationally efficient and statistically sound.
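The first finding can be illustrated with a small simulation. The sketch below is not the paper's estimator: it uses Huber-loss regression fitted by iteratively reweighted least squares as one generic stand-in for "robust regression". The function names, sample size, mismatch fraction, noise level, and the 1.345 tuning constant are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_sparsely_permuted(n=500, d=3, frac=0.2, noise=0.1, seed=0):
    """Hypothetical data generator: a fraction `frac` of responses is shuffled."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    beta = np.arange(1.0, d + 1.0)            # assumed true coefficients (1, 2, ..., d)
    y = X @ beta + noise * rng.normal(size=n)
    k = int(frac * n)
    idx = rng.choice(n, size=k, replace=False)
    y_perm = y.copy()
    # Cyclic shift among the selected rows: each one receives another unit's response.
    y_perm[idx] = y[np.roll(idx, 1)]
    return X, y_perm, beta, idx

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber-type regression via iteratively reweighted least squares (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale from the median absolute deviation, re-estimated each pass.
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))
        thresh = c * max(scale, 1e-12)
        # Huber weights: 1 for small residuals, thresh/|r| for large ones.
        w = np.minimum(1.0, thresh / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

X, y_perm, beta_true, idx = simulate_sparsely_permuted()
beta_ols = np.linalg.lstsq(X, y_perm, rcond=None)[0]
beta_rob = huber_irls(X, y_perm)

print("OLS error:   ", np.linalg.norm(beta_ols - beta_true))
print("Robust error:", np.linalg.norm(beta_rob - beta_true))
```

In this setup the mismatched responses are nearly unrelated to their predictors, so least squares tends to shrink the coefficients toward zero, while the reweighting step down-weights rows with large residuals. Rows with the smallest final weights are natural candidates for mismatches, although this sketch does not attempt to recover the permutation itself.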
Abstract
In regression analysis of multivariate data, it is tacitly assumed that response and predictor variables in each observed response-predictor pair correspond to the same entity or unit. In this paper, we consider the situation of "permuted data" in which this basic correspondence has been lost. Several recent papers have considered this situation without further assumptions on the underlying permutation. In applications, the latter is often known to have additional structure that can be leveraged. Specifically, we herein consider the common scenario of "sparsely permuted data" in which only a small fraction of the data is affected by a mismatch between response and predictors. However, an adverse effect already observed for sparsely permuted data is that the least squares estimator as well as other estimators not accounting for such partial mismatch are inconsistent. One approach…
