Outlier detection in regression: conic quadratic formulations
Andrés Gómez, José Neto

TL;DR
This paper introduces stronger conic quadratic formulations for outlier detection in linear regression, avoiding big-M constraints and significantly improving computational efficiency over existing methods.
Contribution
It develops novel second-order conic relaxations for outlier detection in regression, outperforming traditional big-M linearization approaches.
Findings
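In the authors' computational experiments, the proposed formulations are several orders of magnitude faster than existing big-M formulations.
The new second-order conic relaxations are stronger than big-M relaxations and do not require big-M constraints.
The approach targets linear regression with corrupted data points (outliers), modeled as a mixed-integer optimization problem.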
Abstract
In many applications, when building linear regression models, it is important to account for the presence of outliers, i.e., corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxation and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders-of-magnitude faster than existing big-M formulations in the literature for this problem.
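To make the problem structure in the abstract concrete, here is a minimal sketch of one common way to write it, assumed here rather than taken from the paper: minimize the sum over i of (1 - z_i) * (y_i - x_i' beta)^2, where z_i in {0,1} flags point i as an outlier and at most k points may be flagged. Each term is a binary variable times a quadratic in beta, which is the cubic structure the abstract describes. The code below solves a toy instance by brute-force enumeration. It is not the paper's conic formulation or a big-M model, and the function name `trimmed_least_squares`, the budget `k`, and the data are hypothetical.

```python
from itertools import combinations

import numpy as np


def trimmed_least_squares(X, y, k):
    """Brute-force outlier detection (illustrative only, exponential in n).

    For every set of k discarded points (z_i = 1), fit ordinary least
    squares on the remaining points and keep the set with the smallest
    sum of squared residuals. Returns (sse, beta, outliers).
    """
    n = len(y)
    best = (np.inf, None, None)
    for outliers in combinations(range(n), k):
        keep = np.setdiff1d(np.arange(n), outliers)
        # Least-squares fit on the points that are NOT flagged as outliers.
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        sse = float(np.sum((y[keep] - X[keep] @ beta) ** 2))
        if sse < best[0]:
            best = (sse, beta, outliers)
    return best


# Toy data (assumed): y = 2x + 1 exactly, except that point 3 is corrupted.
x = np.arange(6, dtype=float)
X = np.column_stack([np.ones_like(x), x])  # intercept column plus slope column
y = 2.0 * x + 1.0
y[3] += 10.0

sse, beta, outliers = trimmed_least_squares(X, y, k=1)
print("flagged outliers:", outliers)
print("fitted coefficients:", np.round(beta, 6))
```

Enumeration scales combinatorially with the number of points, which is why mixed-integer formulations, and the strength of their relaxations, matter for instances of realistic size.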
Taxonomy
Topics: Advanced Statistical Methods and Models · Fault Detection and Control Systems · Fuzzy Systems and Optimization
Methods: Linear Regression
