High-Dimensional Regression with Binary Coefficients. Estimating Squared Error and a Phase Transition

David Gamarnik; Ilias Zadik

arXiv:1701.04455·stat.ML·September 26, 2019·5 cites

Open Access

TL;DR

This paper analyzes the phase transition and structural properties of high-dimensional sparse linear regression with binary coefficients, providing precise error bounds, thresholds for recovery, and conjectures on computational hardness.

Contribution

It introduces a novel conditional second moment method to approximate the optimal squared error and characterizes phase transitions and the Overlap Gap Property in binary sparse regression.

Findings

01

Identifies a sharp phase transition at n^*=2k\log p/\log (2k/\sigma^{2}+1).

02

Shows the existence of an all-or-nothing recovery property around n^*.

03

Conjectures that the Overlap Gap Property indicates algorithmic hardness below certain sample sizes.
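The threshold in the first finding is easy to evaluate numerically. The following is a minimal sketch, not code from the paper: the function name `critical_sample_size` and the example values of p, k, and σ² are illustrative assumptions. Because the formula is a ratio of two logarithms, the choice of logarithm base does not affect the result.

```python
import math


def critical_sample_size(p: int, k: int, sigma2: float) -> float:
    """Evaluate n^* = 2k log p / log(2k / sigma^2 + 1).

    p      : number of features (hypothetical example input)
    k      : sparsity, i.e. support size of the binary vector
    sigma2 : noise variance sigma^2
    """
    if p <= 1 or k <= 0 or sigma2 <= 0:
        raise ValueError("need p > 1, k > 0 and sigma2 > 0")
    # Ratio of logarithms, so any base gives the same value.
    return 2 * k * math.log(p) / math.log(2 * k / sigma2 + 1)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(critical_sample_size(p=1000, k=10, sigma2=1.0))
```

For a fixed p and k, lowering the noise variance increases the denominator and therefore lowers n^*, which matches the intuition that cleaner observations require fewer samples.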

Abstract

We consider a sparse linear regression model Y=X\beta^{*}+W, where X has Gaussian entries, W is the noise vector with mean-zero Gaussian entries, and \beta^{*} is a binary vector with support size (sparsity) k. Using a novel conditional second moment method, we obtain an approximation, tight up to a multiplicative constant, of the optimal squared error \min_{\beta}\|Y-X\beta\|_{2}, where the minimization is over all k-sparse binary vectors \beta. The approximation reveals interesting structural properties of the underlying regression problem. In particular, a) we establish that n^*=2k\log p/\log (2k/\sigma^{2}+1) is a phase transition point with the following "all-or-nothing" property. When n exceeds n^{*}, (2k)^{-1}\|\beta_{2}-\beta^*\|_0\approx 0, and when n is below n^{*}, (2k)^{-1}\|\beta_{2}-\beta^*\|_0\approx 1, where \beta_2 is the optimal solution achieving the smallest squared…
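To make the objects in the abstract concrete, here is a small brute-force sketch of the minimization over all k-sparse binary vectors. It is an illustration under stated assumptions, not the paper's method: the entries of X are taken to be i.i.d. standard Gaussian, the function name `brute_force_binary_regression` is hypothetical, and exhaustive search is only feasible for very small p and k. It uses only the standard library so that it runs without extra dependencies.

```python
import itertools
import math
import random


def brute_force_binary_regression(n, p, k, sigma, seed=0):
    """Simulate Y = X beta* + W and minimize ||Y - X beta||_2 by enumeration.

    Returns (smallest error, normalized distance (2k)^{-1} ||beta_hat - beta*||_0).
    """
    rng = random.Random(seed)
    # Assumption of this sketch: i.i.d. standard Gaussian design entries.
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # beta* is the indicator vector of a uniformly random support of size k.
    true_support = set(rng.sample(range(p), k))
    # For a binary beta*, X beta* is the sum of the supported columns.
    Y = [sum(row[j] for j in true_support) + rng.gauss(0.0, sigma) for row in X]

    best_err, best_support = math.inf, None
    for S in itertools.combinations(range(p), k):
        residual_sq = sum((y - sum(row[j] for j in S)) ** 2 for y, row in zip(Y, X))
        err = math.sqrt(residual_sq)
        if err < best_err:
            best_err, best_support = err, set(S)

    # For binary vectors, the l0 distance is the size of the symmetric difference.
    hamming = len(best_support ^ true_support)
    return best_err, hamming / (2 * k)


if __name__ == "__main__":
    # Illustrative parameters only; the search visits C(12, 3) = 220 supports.
    err, dist = brute_force_binary_regression(n=20, p=12, k=3, sigma=1.0)
    print(err, dist)
```

The normalized distance returned here is the quantity (2k)^{-1}\|\beta_{2}-\beta^*\|_0 from the abstract: it is 0 when the minimizer recovers the true support and 1 when the two supports are disjoint. At such small problem sizes the output should not be expected to reproduce the asymptotic all-or-nothing behavior.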

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods

Methods: Linear Regression