A Statistical Perspective on Algorithmic Leveraging
Ping Ma, Michael W. Mahoney, Bin Yu

TL;DR
This paper evaluates the statistical properties of algorithmic leveraging for linear regression, finds that neither leverage-based nor uniform sampling dominates the other in terms of bias and variance, and introduces two improved leveraging algorithms with empirical validation.
Contribution
The paper provides a statistical framework for analyzing leverage-based sampling in linear regression, shows that neither leverage-based nor uniform sampling dominates the other in bias and variance, and proposes two new leveraging algorithms that perform better empirically.
Findings
Neither leverage-based nor uniform sampling dominates the other in bias and variance for linear regression.
The theoretical results predict the practical performance of leverage-based algorithms.
The two new leveraging algorithms outperform existing methods in empirical tests.
Abstract
One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales rows/columns of data matrices to reduce the data size before performing computations on the subproblem. This method has been successful in improving computational efficiency of algorithms for matrix problems such as least-squares approximation, least absolute deviations approximation, and low-rank matrix approximation. Existing work has focused on algorithmic issues such as worst-case running times and numerical issues associated with providing high-quality implementations, but none of it addresses statistical aspects of this method. In this paper, we provide a simple yet effective framework to evaluate the statistical properties of algorithmic leveraging…
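The sample-and-rescale step described in the abstract can be illustrated with a minimal sketch for least-squares regression. This is not the authors' implementation: the function names `leverage_scores` and `leveraging_ols`, the synthetic data, and the subsample size `r` are all assumptions chosen for illustration. It assumes the design matrix has full column rank, computes exact leverage scores from a thin QR factorization, samples rows with replacement in proportion to those scores, rescales each sampled row by 1/sqrt(r * pi_i), and solves ordinary least squares on the subproblem.

```python
import numpy as np

def leverage_scores(X):
    # Thin QR: X = QR. The leverage score of row i is the squared
    # Euclidean norm of row i of Q (the i-th diagonal of the hat matrix).
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def leveraging_ols(X, y, r, rng):
    # Hypothetical helper: leverage-based subsampled least squares.
    n, _ = X.shape
    h = leverage_scores(X)
    pi = h / h.sum()                      # importance sampling distribution
    idx = rng.choice(n, size=r, replace=True, p=pi)
    w = 1.0 / np.sqrt(r * pi[idx])        # rescaling factor per sampled row
    Xs = X[idx] * w[:, None]
    ys = y[idx] * w
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.standard_normal((n, p))
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)   # full-data OLS
beta_lev = leveraging_ols(X, y, r=200, rng=rng)     # subsampled estimate
```

Replacing `pi` with the constant vector `np.full(n, 1.0 / n)` gives the uniform-sampling counterpart, so the same helper can be used to compare the two schemes over repeated draws.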
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Face and Expression Recognition
Methods: Linear Regression
