Best Subset Selection via a Modern Optimization Lens
Dimitris Bertsimas, Angela King, Rahul Mazumder

TL;DR
This paper introduces a mixed integer optimization approach for best subset selection in linear regression, achieving high-quality solutions efficiently and outperforming traditional methods like Lasso in predictive accuracy and sparsity.
Contribution
It develops an MIO-based algorithm that returns a solution with a guarantee on its suboptimality even when terminated early, accommodates side constraints on the regression coefficients, and extends to the least absolute deviation loss.
Findings
Solves problems with n in the 1000s and p in the 100s to provable optimality in minutes, and finds near-optimal solutions for n in the 100s and p in the 1000s.
Provides solutions with provable suboptimality guarantees.
Outperforms Lasso and similar methods in predictive power and sparsity.
Abstract
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed Integer Optimization (MIO) problems. We present a MIO approach for solving the classical best subset selection problem of choosing k out of p features in linear regression given n observations. We develop a discrete extension of modern first order continuous optimization methods to find high quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression and (c) extends to finding best subset solutions for the least absolute deviation…
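The warm-start stage described in the abstract can be illustrated with a minimal sketch. It assumes the "discrete extension" takes the form of a gradient step on the least squares loss followed by hard thresholding to the k largest-magnitude coefficients, with step size 1/L where L is the largest eigenvalue of XᵀX. The function names, defaults, and the final least squares polish are illustrative assumptions rather than the authors' implementation, and the MIO solver stage that certifies optimality is not shown.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    k = min(k, v.size)
    out = np.zeros_like(v, dtype=float)
    if k > 0:
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
    return out


def discrete_first_order(X, y, k, max_iter=1000, tol=1e-8):
    """Heuristic for min 0.5 * ||y - X b||^2 subject to at most k nonzeros in b.

    Hypothetical helper: returns a feasible (not certified optimal) solution
    that could serve as a warm start for an exact solver.
    """
    n, p = X.shape
    # Lipschitz constant of the gradient: largest eigenvalue of X^T X.
    L = np.linalg.norm(X, 2) ** 2
    if L == 0.0:
        return np.zeros(p)
    beta = np.zeros(p)
    for _ in range(max_iter):
        grad = X.T @ (X @ beta - y)
        beta_new = hard_threshold(beta - grad / L, k)
        converged = np.linalg.norm(beta_new - beta) <= tol
        beta = beta_new
        if converged:
            break
    # Polish: refit ordinary least squares on the selected support.
    support = np.flatnonzero(beta)
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta = np.zeros(p)
        beta[support] = coef
    return beta
```

Calling `discrete_first_order(X, y, k)` on an n-by-p design matrix returns a coefficient vector with at most k nonzeros. Because the procedure is a local heuristic, its output is only a candidate solution; the suboptimality guarantee mentioned in the abstract comes from the MIO solver, not from this step.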
Taxonomy
Methods: Linear Regression
