Optimal Shrinkage of Singular Values Under Random Data Contamination
Danny Barash, Matan Gavish

TL;DR
This paper introduces a unified framework for handling various data contamination models in low-rank matrix reconstruction, developing an optimal algorithm based on singular value manipulation and identifying a fundamental signal-to-noise threshold.
Contribution
It presents an asymptotically optimal method for low-rank matrix recovery under diverse contamination models using singular value adjustments.
Findings
Unified framework for contamination models
Optimal singular value-based reconstruction algorithm
Explicit signal-to-noise cutoff for successful estimation
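To make the contamination setting concrete, the following is a minimal simulation sketch, not the paper's formal framework. It assumes each contamination type acts independently on individual entries; the function name `contaminate`, its parameters, and all default values are hypothetical choices made for illustration. Additive noise is drawn from a symmetric uniform distribution (one reading of "uniform noise"), and missing entries are encoded as zeros.

```python
import numpy as np

def contaminate(X, noise_width=0.1, p_missing=0.2, p_outlier=0.05,
                outlier_scale=5.0, seed=0):
    """Return a contaminated copy Y of X (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Additive noise: i.i.d. uniform on [-noise_width, noise_width].
    Y = X + rng.uniform(-noise_width, noise_width, size=X.shape)

    # Outliers: a random subset of entries is replaced by large random values.
    outliers = rng.random(X.shape) < p_outlier
    Y[outliers] = outlier_scale * rng.standard_normal(int(outliers.sum()))

    # Missing values: a random subset of entries is zeroed out.
    missing = rng.random(X.shape) < p_missing
    Y[missing] = 0.0

    return Y, missing, outliers

# Usage: contaminate a rank-1 matrix.
X = np.outer(np.arange(1.0, 6.0), np.ones(4))
Y, missing, outliers = contaminate(X)
```

Because missing entries are zeroed last, an entry that is both an outlier and missing ends up as zero; a different ordering would be an equally valid modelling choice.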
Abstract
A low rank matrix X has been contaminated by uniformly distributed noise, missing values, outliers and corrupt entries. Reconstruction of X from the singular values and singular vectors of the contaminated matrix Y is a key problem in machine learning, computer vision and data science. In this paper we show that common contamination models (including arbitrary combinations of uniform noise, missing values, outliers and corrupt entries) can be described efficiently using a single framework. We develop an asymptotically optimal algorithm that estimates X by manipulation of the singular values of Y, which applies to any of the contamination models considered. Finally, we find an explicit signal-to-noise cutoff, below which estimation of X from the singular value decomposition of Y must fail, in a well-defined sense.
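The general shape of an estimator that works "by manipulation of the singular values of Y" can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the helper names `shrink_singular_values` and `hard_threshold` are hypothetical, and the hard-threshold rule with a hand-picked cutoff is only a stand-in for the optimal shrinker, whose formula is not given in this summary.

```python
import numpy as np

def shrink_singular_values(Y, shrinker):
    """Estimate X by applying `shrinker` to the singular values of Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = shrinker(s)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt

def hard_threshold(cutoff):
    """Placeholder shrinker: keep singular values above `cutoff`, zero the rest."""
    return lambda s: np.where(s > cutoff, s, 0.0)

# Usage: a rank-1 signal observed in additive Gaussian noise.
rng = np.random.default_rng(0)
n, m = 200, 100
u = rng.standard_normal(n)
v = rng.standard_normal(m)
X = 5.0 * np.outer(u / np.linalg.norm(u), v / np.linalg.norm(v))
Y = X + rng.standard_normal((n, m)) / np.sqrt(n)

# The cutoff 2.5 is an assumed value chosen by hand for this noise level.
X_hat = shrink_singular_values(Y, hard_threshold(2.5))

err_raw = np.linalg.norm(Y - X)      # Frobenius error of the raw data
err_hat = np.linalg.norm(X_hat - X)  # Frobenius error after shrinkage
```

Passing the shrinker as a function keeps the decomposition and reconstruction steps fixed, so a different rule for the singular values can be swapped in without touching the rest of the code.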
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Random Matrices and Applications · Stochastic Gradient Optimization Techniques
