TL;DR
This paper introduces an optimization method for building risk scores that handles continuous predictors directly: the model selects each predictor's threshold itself instead of relying on a discretization step applied beforehand.
Contribution
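It presents a novel mixed-integer nonlinear optimization formulation that minimizes the logistic loss under sparsity constraints and lets the model choose the threshold for each continuous predictor, unlike prior methods that discretize predictors in preprocessing with arbitrary thresholds such as quantiles.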
Findings
Usefulness tested on synthetic datasets
Outperforms discretization-based methods
Enables direct threshold selection for continuous variables
Abstract
In this paper, we propose a novel Mixed-Integer Non-Linear Optimization formulation to construct a risk score, where we optimize the logistic loss with sparsity constraints. Previous approaches are typically designed to handle binary datasets, where continuous predictor variables are discretized in a preprocessing step by using arbitrary thresholds, such as quantiles. In contrast, we allow the model to decide for each continuous predictor variable the particular threshold that is critical for prediction. The usefulness of the resulting optimization problem is tested in synthetic datasets.
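The contrast between a preset quantile threshold and a threshold chosen for prediction can be illustrated with a small sketch. This is not the paper's mixed-integer formulation: it assumes a single continuous predictor, ignores the sparsity constraints and any integrality of the score coefficients, and replaces the optimization solver with an exhaustive search over candidate thresholds. The data and the function names (`group_loss`, `threshold_loss`, `best_threshold`) are hypothetical. It relies on the fact that a logistic model with an intercept and one binary feature fits each of the two groups at its empirical positive rate.

```python
import math
import statistics


def group_loss(pos, n):
    """Logistic loss of a group whose fitted probability is its positive rate."""
    if n == 0 or pos == 0 or pos == n:
        return 0.0  # empty or pure group: the best achievable loss tends to 0
    p = pos / n
    return -(pos * math.log(p) + (n - pos) * math.log(1 - p))


def threshold_loss(x, y, t):
    """Minimal logistic loss of intercept + binary feature 1[x >= t]."""
    above = [yi for xi, yi in zip(x, y) if xi >= t]
    below = [yi for xi, yi in zip(x, y) if xi < t]
    return group_loss(sum(above), len(above)) + group_loss(sum(below), len(below))


def best_threshold(x, y):
    """Try every midpoint between consecutive distinct values of x."""
    values = sorted(set(x))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: threshold_loss(x, y, t))


# Hypothetical toy data: the outcome mostly switches on near x = 3.
x = [0.1, 0.4, 0.9, 1.3, 2.0, 2.2, 2.7, 3.1, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1]

median_t = statistics.median(x)   # preset quantile threshold
learned_t = best_threshold(x, y)  # threshold chosen to minimize the loss

print("median threshold:", median_t, "loss:", threshold_loss(x, y, median_t))
print("learned threshold:", learned_t, "loss:", threshold_loss(x, y, learned_t))
```

On this toy data the searched threshold falls between 2.7 and 3.1 and gives a lower logistic loss than the median split. Exhaustive search is only practical for a single predictor; choosing thresholds jointly across many predictors under sparsity constraints is the kind of problem the optimization formulation addresses.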
