One Model Many Scores: Using Multiverse Analysis to Prevent Fairness Hacking and Evaluate the Influence of Model Design Decisions
Jan Simson, Florian Pfisterer, Christoph Kern

TL;DR
This paper introduces multiverse analysis to explicitly examine how various design and evaluation decisions impact fairness metrics in algorithmic decision-making systems, highlighting potential vulnerabilities and robustness issues.
Contribution
It presents a novel application of multiverse analysis to fairness in algorithmic decision making (ADM), turning implicit decisions into explicit variables to assess their effects on fairness outcomes.
Findings
Fairness metrics vary significantly across different decision combinations.
Evaluation decisions can be exploited to falsely portray models as fair.
Multiverse analysis reveals robustness or fragility of fairness assessments.
Abstract
A vast number of systems across the world use algorithmic decision making (ADM) to (partially) automate decisions that have previously been made by humans. The downstream effects of ADM systems critically depend on the decisions made during a system's design, implementation, and evaluation, as biases in data can be mitigated or reinforced along the modeling pipeline. Many of these decisions are made implicitly, without knowing exactly how they will influence the final system. To study this issue, we draw on insights from the field of psychology and introduce the method of multiverse analysis for algorithmic fairness. In our proposed method, we turn implicit decisions during design and evaluation into explicit ones and demonstrate their fairness implications. By combining decisions, we create a grid of all possible "universes" of decision combinations. For each of these universes, we…
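The grid of "universes" described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the decision names and options in `DECISIONS`, the toy `RECORDS`, and the choice of a demographic-parity-style gap as the fairness metric. It enumerates every combination of decision options and computes the metric once per universe, which makes the spread across universes visible.

```python
from itertools import product

# Hypothetical decision space; names and options are illustrative, not the paper's.
DECISIONS = {
    "threshold": [0.4, 0.5, 0.6],        # evaluation decision: score cutoff
    "drop_small_groups": [True, False],  # design decision: preprocessing step
    "metric_groups": ["binary", "all"],  # evaluation decision: which groups are compared
}

def build_universes(decisions):
    """Return one dict per combination of decision options (the full grid)."""
    names = list(decisions)
    return [dict(zip(names, combo)) for combo in product(*decisions.values())]

# Toy records of (group, model score); invented data for illustration only.
RECORDS = [
    ("a", 0.35), ("a", 0.55), ("a", 0.65), ("a", 0.80),
    ("b", 0.30), ("b", 0.45), ("b", 0.58), ("b", 0.70),
    ("c", 0.62),
]

def parity_gap(universe, records):
    """Largest difference in positive-decision rate between the compared groups."""
    rows = records
    if universe["drop_small_groups"]:
        # Remove groups with fewer than two records.
        counts = {}
        for group, _ in rows:
            counts[group] = counts.get(group, 0) + 1
        rows = [(g, s) for g, s in rows if counts[g] >= 2]
    if universe["metric_groups"] == "binary":
        # Collapse all groups other than "a" into a single comparison group.
        rows = [("a" if g == "a" else "rest", s) for g, s in rows]
    rates = {}
    for group in {g for g, _ in rows}:
        scores = [s for g, s in rows if g == group]
        rates[group] = sum(s >= universe["threshold"] for s in scores) / len(scores)
    return max(rates.values()) - min(rates.values())

universes = build_universes(DECISIONS)
gaps = [parity_gap(u, RECORDS) for u in universes]
print(len(universes), "universes; parity gap ranges from",
      round(min(gaps), 3), "to", round(max(gaps), 3))
```

Reporting the full range of `gaps`, instead of the value from a single hand-picked universe, is the kind of safeguard against selective reporting that the summary above refers to as preventing fairness hacking.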
Taxonomy
Topics: Ethics and Social Impacts of AI · Health Systems, Economic Evaluations, Quality of Life
