Supersparse Linear Integer Models for Optimized Medical Scoring Systems
Berk Ustun, Cynthia Rudin

TL;DR
This paper introduces SLIM, a novel method for creating sparse, accurate, and operationally constrained linear scoring systems for medical diagnosis, optimized through integer programming.
Contribution
SLIM is a data-driven approach that directly encodes measures of accuracy and sparsity in an integer program and accommodates operational constraints, producing tailored scoring systems without parameter tuning.
Findings
SLIM achieves high accuracy and sparsity in medical scoring tasks.
The method effectively incorporates operational constraints.
SLIM's scalability is improved with a new data reduction technique.
Abstract
Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new method for creating data-driven scoring systems called a Supersparse Linear Integer Model (SLIM). SLIM scoring systems are built by solving an integer program that directly encodes measures of accuracy (the 0-1 loss) and sparsity (the $\ell_0$-seminorm) while restricting coefficients to coprime integers. SLIM can seamlessly incorporate a wide range of operational constraints related to accuracy and sparsity, and can produce highly tailored models without parameter tuning. We provide bounds on the…
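To make the objective described in the abstract concrete, the following is a minimal sketch, not the authors' integer program. It assumes a toy dataset and finds a scoring system by brute-force enumeration over small integer coefficients, minimizing the 0-1 loss plus a penalty on the number of nonzero coefficients. The function names, the coefficient ranges, the trade-off value `c0`, and the gcd filter used to keep coefficients coprime are all illustrative assumptions; a real solver would replace the enumeration.

```python
from functools import reduce
from itertools import product
from math import gcd


def zero_one_loss(coefs, intercept, X, y):
    """Fraction of examples misclassified by the rule: predict +1 if score > 0."""
    errors = 0
    for xi, yi in zip(X, y):
        score = intercept + sum(c * v for c, v in zip(coefs, xi))
        pred = 1 if score > 0 else -1
        errors += pred != yi
    return errors / len(y)


def fit_tiny_scoring_system(X, y, coef_range=range(-2, 3),
                            intercept_range=range(-3, 4), c0=0.01):
    """Enumerate integer models and return (objective, coefs, intercept).

    Objective = 0-1 loss + c0 * (number of nonzero coefficients).
    Hypothetical helper: exhaustive search only works for a handful of features.
    """
    best = None
    n_features = len(X[0])
    for intercept in intercept_range:
        for coefs in product(coef_range, repeat=n_features):
            # Crude coprimality filter (an assumption of this sketch):
            # skip coefficient vectors that share a common factor > 1.
            if reduce(gcd, coefs, 0) > 1:
                continue
            l0 = sum(c != 0 for c in coefs)
            objective = zero_one_loss(coefs, intercept, X, y) + c0 * l0
            if best is None or objective < best[0]:
                best = (objective, coefs, intercept)
    return best


# Toy data: three binary features; the label is +1 only when the first two
# are both present. The third feature is irrelevant by construction.
X = [list(bits) for bits in product([0, 1], repeat=3)]
y = [1 if x[0] == 1 and x[1] == 1 else -1 for x in X]

objective, coefs, intercept = fit_tiny_scoring_system(X, y)
print("coefficients:", coefs, "intercept:", intercept)
print("training 0-1 loss:", zero_one_loss(coefs, intercept, X, y))
```

On this toy data the sparsity penalty drives the coefficient of the irrelevant third feature to zero, while the 0-1 loss term keeps both informative features in the model. Enumeration grows exponentially with the number of features, which is why an integer programming formulation is needed for anything beyond a toy problem.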
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · Statistical Methods and Inference
