Predictive Multiplicity in Classification
Charles T. Marx, Flavio du Pin Calmon, Berk Ustun

TL;DR
This paper defines predictive multiplicity in classification, introduces measures of its severity, and provides integer programming tools to compute them exactly for linear classifiers, showing that real-world datasets may admit competing models with highly conflicting predictions.
Contribution
The paper formalizes the concept of predictive multiplicity, develops integer programming methods to measure it exactly for linear classification problems, and demonstrates its significance on real-world recidivism prediction datasets.
Findings
Real-world recidivism prediction datasets may exhibit severe predictive multiplicity.
Competing models that perform almost equally well can assign conflicting predictions.
Integer programming tools compute the multiplicity measures exactly for linear classification problems.
Abstract
Prediction problems often admit competing models that perform almost equally well. This effect challenges key assumptions in machine learning when competing models assign conflicting predictions. In this paper, we define predictive multiplicity as the ability of a prediction problem to admit competing models with conflicting predictions. We introduce formal measures to evaluate the severity of predictive multiplicity and develop integer programming tools to compute them exactly for linear classification problems. We apply our tools to measure predictive multiplicity in recidivism prediction problems. Our results show that real-world datasets may admit competing models that assign wildly conflicting predictions, and motivate the need to measure and report predictive multiplicity in model development.
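The idea in the abstract can be illustrated with a small sketch. The code below is not the paper's integer programming method: it only scans a finite, hand-supplied pool of candidate models, so it yields lower bounds rather than exact values. The function names, the `epsilon` tolerance, and the two summary quantities (the share of points whose prediction is flipped by at least one competing model, and the largest share of points on which any single competing model disagrees with the baseline) are assumptions chosen for illustration and are not taken from this summary.

```python
def error_rate(preds, labels):
    """Fraction of points where the prediction differs from the label."""
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


def multiplicity_over_pool(baseline, pool, labels, epsilon):
    """Summarize conflicts between a baseline and near-equally accurate models.

    baseline: list of 0/1 predictions from the baseline model.
    pool:     list of prediction lists from candidate models (hypothetical input).
    epsilon:  allowed increase in error rate for a model to count as competing.
    Returns (flipped_share, max_disagreement, number_of_competing_models).
    """
    n = len(labels)
    base_err = error_rate(baseline, labels)

    # Competing models: error within epsilon of the baseline's error.
    competing = [m for m in pool if error_rate(m, labels) <= base_err + epsilon]

    # Share of points whose prediction is flipped by at least one competing model.
    flipped = [any(m[i] != baseline[i] for m in competing) for i in range(n)]
    flipped_share = sum(flipped) / n

    # Largest share of points on which a single competing model disagrees.
    max_disagreement = max(
        (sum(a != b for a, b in zip(m, baseline)) / n for m in competing),
        default=0.0,
    )
    return flipped_share, max_disagreement, len(competing)


# Toy data, made up for this example.
labels = [1, 1, 0, 0, 1, 0, 1, 0]
baseline = [1, 1, 0, 0, 1, 0, 0, 0]
pool = [
    [1, 1, 0, 0, 1, 0, 1, 1],  # same error rate as the baseline
    [0, 1, 0, 0, 1, 0, 1, 0],  # same error rate as the baseline
    [0, 0, 1, 1, 1, 0, 1, 0],  # much worse, so not a competing model
]
flipped_share, max_disagreement, n_competing = multiplicity_over_pool(
    baseline, pool, labels, epsilon=0.01
)
print(flipped_share, max_disagreement, n_competing)
```

In this toy pool, two models match the baseline's error rate yet disagree with it on different points, so the share of points with a conflicting prediction is larger than the disagreement of any single model. An exact computation over all linear classifiers, as the abstract describes, would require the integer programming formulation rather than a finite pool.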
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Imbalanced Data Classification Techniques · Computability, Logic, AI Algorithms
