Wasserstein Logistic Regression with Mixed Features
Aras Selvi, Mohammad Reza Belbasi, Martin B Haugh, Wolfram Wiesemann

TL;DR
This paper introduces a novel polynomial-time method for distributionally robust logistic regression that effectively handles mixed numerical and categorical features, outperforming traditional approaches in benchmark tests.
Contribution
The paper develops a polynomial-time solution scheme for distributionally robust logistic regression with mixed features. In contrast to earlier distributionally robust formulations, the resulting model cannot be reformulated as a regularized logistic regression, so it is a genuinely new variant.
Findings
Outperforms unregularized logistic regression on mixed data
Outperforms regularized logistic regression on categorical data
Provides a polynomial-time solution scheme for an optimization problem of exponential size
Abstract
Recent work has leveraged the popular distributionally robust optimization paradigm to combat overfitting in classical logistic regression. While the resulting classification scheme displays a promising performance in numerical experiments, it is inherently limited to numerical features. In this paper, we show that distributionally robust logistic regression with mixed (i.e., numerical and categorical) features, despite amounting to an optimization problem of exponential size, admits a polynomial-time solution scheme. We subsequently develop a practically efficient column-and-constraint approach that solves the problem as a sequence of polynomial-time solvable exponential conic programs. Our model retains many of the desirable theoretical features of previous works, but -- in contrast to the literature -- it does not admit an equivalent representation as a regularized logistic…
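The "exponential size" mentioned in the abstract can be illustrated with a short counting sketch. It assumes, purely for illustration and not as the paper's formulation, that each categorical feature j takes k_j distinct values, so any formulation that indexes over every joint categorical realization has a number of terms equal to the product of the k_j. The function names and level counts below are hypothetical.

```python
from itertools import product
from math import prod


def count_categorical_realizations(levels):
    """Number of joint value combinations for categorical features.

    `levels` is a list of level counts k_j, one per categorical feature
    (an illustrative assumption, not notation from the paper).
    """
    return prod(levels)


def enumerate_realizations(levels):
    """Yield every joint realization as a tuple of level indices."""
    return product(*(range(k) for k in levels))


# Three categorical features with 2, 3 and 4 levels: small enough to enumerate.
levels_small = [2, 3, 4]
all_small = list(enumerate_realizations(levels_small))
n_small = count_categorical_realizations(levels_small)

# Thirty features with 4 levels each: the count is 4**30, far too many to enumerate.
levels_large = [4] * 30
n_large = count_categorical_realizations(levels_large)

print(n_small, n_large)
```

The count grows multiplicatively with every added categorical feature, which is why brute-force enumeration is impractical and why a scheme that only generates the realizations it needs, such as the column-and-constraint approach described in the abstract, is attractive.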
Taxonomy
Topics: Risk and Portfolio Optimization · Multi-Criteria Decision Making · Fuzzy Systems and Optimization
