Leveraged volume sampling for linear regression
Michał Dereziński, Manfred K. Warmuth, Daniel Hsu

TL;DR
This paper introduces a rescaled volume sampling method for linear regression that produces unbiased estimates from a near-optimal number of samples, improving on previous bounds, and offers a new efficient sampling algorithm with broader applications.
Contribution
The paper develops a rescaled volume sampling technique that provides unbiased estimates with improved sample complexity for linear regression.
Findings
Rescaled volume sampling achieves a $(1+\epsilon)$-approximate loss with sample size $k=O(d\log d + d/\epsilon)$.
The new method outperforms previous unbiased estimators, which require $k=O(d^2/\epsilon)$ samples.
Introduces a determinantal rejection sampling algorithm applicable to determinantal point processes.
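To give a feel for the gap between the two sample-size bounds listed above, the following sketch evaluates both expressions with every hidden constant set to 1. That choice is an assumption made purely for illustration, since the O-notation does not specify constants, and the function names are hypothetical.

```python
import math

def rescaled_bound(d: int, eps: float) -> float:
    """d*log(d) + d/eps, with all constants assumed to be 1 (illustrative only)."""
    return d * math.log(d) + d / eps

def previous_bound(d: int, eps: float) -> float:
    """d^2/eps, with all constants assumed to be 1 (illustrative only)."""
    return d ** 2 / eps

# Print both expressions side by side for a few (d, eps) pairs.
for d, eps in [(10, 0.1), (50, 0.01), (200, 0.01)]:
    new, old = rescaled_bound(d, eps), previous_bound(d, eps)
    print(f"d={d:4d} eps={eps:5.2f}  d log d + d/eps = {new:12.1f}  d^2/eps = {old:12.1f}")
```

Under this assumption the ratio between the two expressions grows roughly like $d$ once $1/\epsilon$ dominates $\log d$. The real constants may shift where the crossover happens.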
Abstract
Suppose an $n \times d$ design matrix in a linear regression problem is given, but the response for each point is hidden unless explicitly requested. The goal is to sample only a small number $k \ll n$ of the responses, and then produce a weight vector whose sum of squares loss over all points is at most $1+\epsilon$ times the minimum. When $k$ is very small (e.g., $k=d$), jointly sampling diverse subsets of points is crucial. One such method, called volume sampling, has a unique and desirable property: the weight vector it produces is an unbiased estimate of the optimum. It is therefore natural to ask whether this method offers the optimal unbiased estimate in terms of the number of responses $k$ needed to achieve a $1+\epsilon$ loss approximation. Surprisingly, we show that volume sampling can have poor behavior when we require a very accurate approximation -- indeed worse than some…
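The unbiasedness property mentioned in the abstract can be checked numerically on a tiny problem. The sketch below uses standard size-$d$ volume sampling, where a subset $S$ of $d$ rows is drawn with probability proportional to $\det(X_S)^2$. It is not the rescaled method the paper introduces. Rather than sampling, it enumerates every subset to compute the exact expectation. The function name, the problem sizes, and the random data are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

def volume_sampling_expectation(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Exact expected subset solution under size-d volume sampling.

    Each size-d subset S has probability proportional to det(X_S)^2, and the
    estimator for S is the solution of the square system X_S w = y_S.
    """
    n, d = X.shape
    total_weight = 0.0
    weighted_sum = np.zeros(d)
    for S in itertools.combinations(range(n), d):
        idx = list(S)
        X_S = X[idx]
        det_sq = np.linalg.det(X_S) ** 2
        if det_sq < 1e-18:
            # Numerically singular subsets carry (essentially) zero probability.
            continue
        w_S = np.linalg.solve(X_S, y[idx])
        total_weight += det_sq
        weighted_sum += det_sq * w_S
    return weighted_sum / total_weight

# Small synthetic problem (assumed sizes: n=8 points, d=3 features).
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
y = rng.standard_normal(8)

w_star = np.linalg.lstsq(X, y, rcond=None)[0]      # least squares optimum on all points
w_expected = volume_sampling_expectation(X, y)     # expectation over volume-sampled subsets

print("optimum:    ", w_star)
print("expectation:", w_expected)
```

Exhaustive enumeration is only feasible for very small $n$ and $d$. It is used here because it gives the expectation exactly, so the comparison with the full least squares solution involves no Monte Carlo error.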
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Point processes and geometric inequalities · Random Matrices and Applications
Methods: Linear Regression
