Learning with Sparsely Permuted Data: A Robust Bayesian Approach
Abhisek Chakraborty, Saptati Datta

TL;DR
This paper introduces a robust Bayesian method for regression with data where predictor or response identifiers are permuted, providing theoretical guarantees and efficient sampling techniques for handling sparsely permuted data.
Contribution
It presents a novel generalized Bayesian framework and sampling scheme for sparse permutation problems, with theoretical guarantees and practical efficiency improvements.
Findings
Effective posterior sampling scheme developed
Theoretical posterior contraction guarantees established
Demonstrated superior performance in numerical experiments
Abstract
Data dispersed across multiple files are commonly integrated through probabilistic linkage methods, where even minimal error rates in record matching can significantly contaminate subsequent statistical analyses. In regression problems, we examine scenarios where the identifiers of predictors or responses are subject to an unknown permutation, challenging the assumption of correspondence. Many emerging approaches in the literature focus on sparsely permuted data, where only a small subset of pairs is affected by the permutation, treating these permuted entries as outliers to restore original correspondence and obtain consistent estimates of regression parameters. In this article, we complement the existing literature by introducing a novel generalized robust Bayesian formulation of the problem. We develop an efficient posterior sampling scheme by adapting the fractional…
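The problem setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' generalized Bayesian method. It only generates sparsely permuted regression data and compares ordinary least squares with a generic Huber-type robust fit, to show why treating mismatched pairs as outliers can help. All names (sparsely_permute, huber_irls) and all settings (n = 500, k = 50, noise level, tuning constant 1.345) are assumptions chosen for illustration.

```python
import numpy as np

def sparsely_permute(y, k, rng):
    """Return a copy of y in which k randomly chosen entries are shuffled among themselves."""
    y_perm = y.copy()
    idx = rng.choice(len(y), size=k, replace=False)
    y_perm[idx] = y[rng.permutation(idx)]
    return y_perm

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber M-estimate via iteratively reweighted least squares (illustrative only)."""
    beta = ols(X, y)
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale from the median absolute deviation.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, c/u for large ones.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta = ols(X * sw[:, None], y * sw)
    return beta

rng = np.random.default_rng(0)
n, k = 500, 50                       # hypothetical sizes: k of n responses are mismatched
beta_true = np.array([2.0, -3.0])    # hypothetical true coefficients
X = rng.normal(size=(n, 2))
y = X @ beta_true + 0.1 * rng.normal(size=n)

y_perm = sparsely_permute(y, k, rng)
n_mismatched = int(np.sum(y_perm != y))

err_ols = np.linalg.norm(ols(X, y_perm) - beta_true)
err_robust = np.linalg.norm(huber_irls(X, y_perm) - beta_true)
print(f"mismatched pairs: {n_mismatched}")
print(f"OLS error:   {err_ols:.3f}")
print(f"Huber error: {err_robust:.3f}")
```

In this setup the mismatched responses are nearly uncorrelated with their predictors, so the least squares estimate is pulled toward zero, while the down-weighted fit stays closer to the true coefficients. A Huber fit is only a stand-in here for the general idea of outlier-robust estimation.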
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
