Uniform Sampling for Matrix Approximation
Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford

TL;DR
This paper revisits uniform row sampling for matrix approximation. It shows that a uniform sample preserves enough information to be refined into a stronger approximation, and it introduces efficient algorithms that maintain row sparsity.
Contribution
It provides new insights into what uniform sampling can achieve, shows that it can be used effectively for matrix approximation, and introduces input-sparsity time algorithms that reweight rows to reduce coherence.
Findings
Uniform sampling preserves a large fraction of the original matrix's information.
Reweighting a small subset of rows can make any matrix have low coherence.
New algorithms run in input-sparsity time and maintain row sparsity.
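The coherence finding can be illustrated on a toy matrix. The sketch below is not the paper's algorithm: it assumes coherence means the largest leverage score, plants a single heavy row, and down-weights only that row using a closed-form weight derived from the Sherman-Morrison identity. The names `leverage_scores`, `alpha`, and `w0` and the matrix itself are hypothetical choices made for illustration.

```python
import numpy as np

def leverage_scores(A):
    # Leverage score of row i is the squared norm of row i of Q,
    # where A = QR is a thin QR factorization (assumes full column rank).
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q * Q, axis=1)

rng = np.random.default_rng(1)
n, d = 2000, 5
A = rng.standard_normal((n, d))
A[0] = 100.0 * np.ones(d)          # one planted heavy row (illustrative)

tau_before = leverage_scores(A)
alpha = 0.05                        # target coherence (assumed value)

# If row 0 is scaled by w, its leverage score becomes w^2 x / (1 + w^2 x),
# where x = t0 / (1 - t0) and t0 is its current score. Solve for score = alpha.
t0 = tau_before[0]
w0 = np.sqrt((alpha / (1 - alpha)) / (t0 / (1 - t0)))

w = np.ones(n)
w[0] = w0                           # only one row is reweighted
tau_after = leverage_scores(w[:, None] * A)

print("coherence before:", tau_before.max())
print("coherence after: ", tau_after.max())
```

Only one weight differs from 1 here because only one row was heavy. With several heavy rows, down-weighting one raises the scores of the others, so a single closed-form step would no longer be enough.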
Abstract
Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear…
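The contrast between leverage score sampling and uniform sampling can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's method: leverage scores are computed exactly through a thin QR factorization (the expensive step the abstract refers to), rows are drawn with replacement and rescaled by 1/sqrt(m p_i), and the test matrix, the sample size `m`, and the helper names `leverage_scores` and `sample_rows` are hypothetical.

```python
import numpy as np

def leverage_scores(A):
    # Leverage score of row i is the squared norm of row i of Q,
    # where A = QR is a thin QR factorization (assumes full column rank).
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q * Q, axis=1)

def sample_rows(A, probs, m, rng):
    # Draw m rows with replacement according to probs, then rescale each
    # sampled row by 1 / sqrt(m * p_i) so that E[S^T S] = A^T A.
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    scale = 1.0 / np.sqrt(m * probs[idx])
    return A[idx] * scale[:, None]

def gram_error(S, A):
    # Relative spectral-norm error between S^T S and A^T A.
    G = A.T @ A
    return np.linalg.norm(S.T @ S - G, 2) / np.linalg.norm(G, 2)

rng = np.random.default_rng(0)
n, d, m = 2000, 5, 200
A = rng.standard_normal((n, d))
A[0] = 100.0 * np.ones(d)           # one row carrying critical information

tau = leverage_scores(A)            # scores sum to d for a full-rank A
S_lev = sample_rows(A, tau / tau.sum(), m, rng)
S_uni = sample_rows(A, np.full(n, 1.0 / n), m, rng)

print("leverage score of heavy row:", tau[0])
print("error, leverage sampling:", gram_error(S_lev, A))
print("error, uniform sampling: ", gram_error(S_uni, A))
```

In this setup the uniform sample usually misses row 0 altogether, since each draw picks it with probability 1/n, while leverage score sampling selects it in most draws. The printed errors vary with the random seed.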
