Uniform Sampling for Matrix Approximation

Michael B. Cohen; Yin Tat Lee; Cameron Musco; Christopher Musco; Richard Peng; Aaron Sidford

arXiv:1408.5099·cs.DS·August 22, 2014

TL;DR

This paper examines what information uniform row sampling preserves in matrix approximation. It shows that a uniform sample retains enough of the original matrix to compute a better approximation, and it introduces efficient algorithms that maintain row sparsity.

Contribution

It gives new insight into what uniform sampling can achieve in matrix approximation, introduces input-sparsity time algorithms that maintain row sparsity, and shows that reweighting a small subset of rows can give any matrix low coherence.

Findings

01

Uniform sampling yields a matrix that, in a weak sense, approximates a large fraction of the original matrix well.

02

Reweighting a small subset of rows can make any matrix have low coherence.

03

New algorithms run in input-sparsity time and maintain row sparsity.
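
The findings refer to leverage scores and coherence without defining them. The block below gives the standard definitions as a reading aid; the notation ($a_i$, $\tau_i$, $\mu$) is an assumption for illustration and may differ from the paper's own.

```latex
% Leverage score of row a_i^T of A \in \mathbb{R}^{n \times d}
% ((\cdot)^{+} denotes the Moore-Penrose pseudoinverse):
\tau_i(A) = a_i^{\top} \left( A^{\top} A \right)^{+} a_i ,
\qquad 0 \le \tau_i(A) \le 1 ,
\qquad \sum_{i=1}^{n} \tau_i(A) = \operatorname{rank}(A) .

% Coherence: the largest leverage score (some authors rescale it by n/d).
\mu(A) = \max_{1 \le i \le n} \tau_i(A) .
```

Since the scores sum to at most $d$, the average score is at most $d/n$. A matrix has low coherence when no row's score is far above that average, which is the regime in which uniform sampling behaves like leverage score sampling.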

Abstract

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear…
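
The contrast the abstract draws between leverage score sampling and uniform sampling can be sketched numerically. The example below is a minimal illustration, not the paper's algorithm: the helper names `leverage_scores` and `sample_rows`, the test matrix, and the sample size are all assumptions. It builds a matrix in which a single row carries all the information about one column, so a uniform sample is likely to miss that row, while a leverage score sample draws it almost surely.

```python
import numpy as np

def leverage_scores(A):
    """Leverage scores of the rows of A, assuming A has full column rank.

    With a thin QR factorization A = QR, the score of row i is the
    squared Euclidean norm of row i of Q.
    """
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

def sample_rows(A, probs, m, rng):
    """Draw m rows i.i.d. with probabilities `probs` (with replacement).

    Each sampled row is rescaled by 1 / sqrt(m * p_i) so that the
    expectation of S^T S equals A^T A.
    """
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    scale = 1.0 / np.sqrt(m * probs[idx])
    return A[idx] * scale[:, None]

rng = np.random.default_rng(0)
n, d, m = 2000, 5, 200

# Hypothetical test matrix: Gaussian entries, except that the last
# column is zero everywhere but row 0, so row 0 has leverage score 1.
A = rng.standard_normal((n, d))
A[:, -1] = 0.0
A[0, -1] = 1.0

tau = leverage_scores(A)
S_lev = sample_rows(A, tau / tau.sum(), m, rng)   # leverage score sampling
S_uni = sample_rows(A, np.full(n, 1.0 / n), m, rng)  # uniform sampling

print("leverage score of row 0:", tau[0])
print("sum of leverage scores:", tau.sum())
print("rank of leverage sample:", np.linalg.matrix_rank(S_lev))
print("rank of uniform sample:", np.linalg.matrix_rank(S_uni))
```

In this setup, each uniform draw picks row 0 with probability 1/2000, so 200 draws miss it about 90% of the time and the sample then loses the last column entirely. Each leverage score draw picks row 0 with probability 1/5, so missing it in 200 draws is vanishingly unlikely.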
