Sharp Asymptotics and Optimal Performance for Inference in Binary Models
Hossein Taheri, Ramtin Pedarsani, and Christos Thrampoulidis

TL;DR
This paper provides precise predictions of the statistical performance of convex empirical risk minimization in high-dimensional binary models, demonstrating near-optimality of least-squares and validating results through simulations.
Contribution
The paper gives sharp asymptotic performance predictions for convex empirical risk minimizers in binary models, derives a bound on the best performance achievable among convex losses, constructs loss functions that attain this bound, and shows that least-squares is near-optimal.
Findings
Performance predictions hold for a wide class of convex loss functions.
For binary linear classification, the performance of least-squares is no worse than 0.997 (Logistic model) and 0.98 (Probit model) times the optimal.
Simulations confirm the theoretical predictions are accurate even at small dimensions.
Abstract
We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem…
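The setting in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' experimental code: it draws isotropic Gaussian features, generates labels from a Logistic model, fits least-squares, and reports the correlation between the estimate and the true signal. The function name `simulate_least_squares`, the values of `n`, `delta`, and `trials`, the unit-norm signal, and the use of correlation as the performance measure are all assumptions made for illustration.

```python
import numpy as np

def simulate_least_squares(n=50, delta=5.0, trials=20, seed=0):
    """Average correlation between the least-squares estimate and the true signal.

    Illustrative assumptions: unit-norm signal, Logistic labels, and
    correlation (cosine similarity) as the performance measure.
    """
    rng = np.random.default_rng(seed)
    m = int(delta * n)  # number of samples; the ratio delta = m/n is held fixed
    corrs = []
    for _ in range(trials):
        x0 = rng.standard_normal(n)
        x0 /= np.linalg.norm(x0)             # unit-norm true signal
        A = rng.standard_normal((m, n))      # isotropic Gaussian features
        p = 1.0 / (1.0 + np.exp(-(A @ x0)))  # Logistic model: P(y = +1 | a)
        y = np.where(rng.random(m) < p, 1.0, -1.0)
        # Least-squares fit of the +/-1 labels on the features.
        x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
        corrs.append(float(x_hat @ x0) / np.linalg.norm(x_hat))
    return float(np.mean(corrs))

avg_corr = simulate_least_squares()
print(f"average correlation: {avg_corr:.3f}")
```

Increasing `n` while keeping `delta` fixed mimics the linear asymptotic regime, so the averaged correlation can be compared across dimensions to see how quickly it stabilizes.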
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Machine Learning and Algorithms
