Optimal Risk Scores for Continuous Predictors

Cristina Molero-Río; Claudia D'Ambrosio

arXiv:2502.08588·math.OC·February 13, 2025·LOD

TL;DR

This paper introduces an optimization method for building risk scores that handles continuous predictors directly: the model selects the threshold for each predictor itself, instead of relying on thresholds fixed in a preprocessing discretization step.

Contribution

It presents a novel mixed-integer nonlinear optimization formulation in which the threshold of each continuous predictor is chosen by the model, unlike prior methods that discretize predictors beforehand using arbitrary thresholds such as quantiles.

Findings

01 Effective in synthetic datasets

02 Outperforms discretization-based methods

03 Enables direct threshold selection for continuous variables

Abstract

In this paper, we propose a novel Mixed-Integer Non-Linear Optimization formulation to construct a risk score, where we optimize the logistic loss with sparsity constraints. Previous approaches are typically designed to handle binary datasets, where continuous predictor variables are discretized in a preprocessing step by using arbitrary thresholds, such as quantiles. In contrast, we allow the model to decide for each continuous predictor variable the particular threshold that is critical for prediction. The usefulness of the resulting optimization problem is tested in synthetic datasets.
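The contrast drawn in the abstract, a threshold fixed at a quantile versus a threshold chosen for prediction, can be illustrated with a small sketch. This is not the paper's Mixed-Integer Non-Linear Optimization formulation: it is a brute-force enumeration over one continuous predictor, and the synthetic data, the coefficient range of -5 to 5, and all function names are assumptions made for illustration only.

```python
import math
import random


def logistic_loss(scores, labels):
    """Mean logistic loss for labels in {-1, +1}."""
    return sum(math.log1p(math.exp(-y * s)) for s, y in zip(scores, labels)) / len(labels)


def best_integer_fit(x, y, t, coef_range=range(-5, 6)):
    """Enumerate integer (intercept, points) for the binary feature 1[x >= t].

    Returns (loss, intercept, points) with the smallest logistic loss.
    """
    best = (float("inf"), 0, 0)
    for b0 in coef_range:
        for b1 in coef_range:
            scores = [b0 + b1 * (xi >= t) for xi in x]
            loss = logistic_loss(scores, y)
            if loss < best[0]:
                best = (loss, b0, b1)
    return best


def median_threshold(x):
    """Preprocessing-style threshold: a data value near the median."""
    xs = sorted(x)
    return xs[(len(xs) - 1) // 2]


def optimized_threshold(x, y):
    """Pick the threshold (midpoint between sorted distinct values) with the lowest loss."""
    xs = sorted(set(x))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min((best_integer_fit(x, y, t)[0], t) for t in candidates)[1]


# Hypothetical synthetic data: the label flips at x = 7, with 10% label noise.
random.seed(0)
x = [random.uniform(0, 10) for _ in range(80)]
y = [(1 if xi >= 7 else -1) * (-1 if random.random() < 0.1 else 1) for xi in x]

t_med = median_threshold(x)
t_opt = optimized_threshold(x, y)
loss_med = best_integer_fit(x, y, t_med)[0]
loss_opt = best_integer_fit(x, y, t_opt)[0]

print(f"median threshold    {t_med:.2f}  loss {loss_med:.4f}")
print(f"optimized threshold {t_opt:.2f}  loss {loss_opt:.4f}")
```

Because the median threshold induces a split that is also among the enumerated candidates, the optimized threshold can never have a higher training loss in this sketch. Enumeration is only practical for a single predictor and a small coefficient range; handling several predictors under a sparsity constraint is the kind of problem an optimization formulation is meant to address.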

Code & Models

Repositories

mmolerous/Risk-Scores (Official)
