The Disparate Benefits of Deep Ensembles
Kajetan Schweighofer, Adrian Arnaiz-Rodriguez, Sepp Hochreiter, Nuria Oliver

TL;DR
Deep Ensembles improve predictive performance, but the gains are distributed unevenly across social groups, which affects group fairness metrics. Post-processing methods can mitigate this effect.
Contribution
This work uncovers the disparate benefits effect of Deep Ensembles on fairness and proposes mitigation via Hardt post-processing.
Findings
Deep Ensembles favor some groups more than others, and the effect shows up across multiple group fairness metrics.
Per-group diversity in ensemble predictions explains the disparate benefits.
Hardt post-processing effectively mitigates the fairness disparities.
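To make the mitigation idea concrete, here is a minimal sketch of group-specific thresholding aimed at equal true positive rates. It is a simplification and not the authors' procedure: Hardt post-processing, as usually described, derives possibly randomized group-specific decision rules by solving a small optimization problem, whereas this sketch picks one deterministic threshold per group. The function names, the toy scores, and the `target_tpr` parameter are all hypothetical, and every group is assumed to contain at least one positive example.

```python
import numpy as np

def group_thresholds_for_tpr(scores, y_true, group, target_tpr):
    """Per group, return the k-th highest positive-class score as threshold,
    where k is the smallest count of positives that reaches target_tpr."""
    thresholds = {}
    for g in np.unique(group):
        # Scores of the truly positive examples in this group, descending.
        pos_scores = np.sort(scores[(group == g) & (y_true == 1)])[::-1]
        k = max(int(np.ceil(target_tpr * len(pos_scores))), 1)
        thresholds[int(g)] = float(pos_scores[k - 1])
    return thresholds

def predict_with_thresholds(scores, group, thresholds):
    """Apply each example's group-specific threshold to its score."""
    thr = np.array([thresholds[int(g)] for g in group])
    return (scores >= thr).astype(int)

def tpr(y_true, y_pred, mask):
    """True positive rate restricted to the examples selected by mask."""
    pos = mask & (y_true == 1)
    return float(y_pred[pos].mean())

# Hypothetical toy data: scores stand in for averaged ensemble probabilities.
scores = np.array([0.8, 0.7, 0.2, 0.3, 0.7, 0.3, 0.4, 0.3])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

thresholds = group_thresholds_for_tpr(scores, y_true, group, target_tpr=1.0)
y_post = predict_with_thresholds(scores, group, thresholds)
tpr_gap = abs(tpr(y_true, y_post, group == 0) - tpr(y_true, y_post, group == 1))
```

On this toy data a shared threshold of 0.5 gives true positive rates of 1.0 and 0.5 for the two groups. The per-group thresholds close that gap, at the cost of more false positives in group 1, which is the kind of trade-off any post-processing method has to manage.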
Abstract
Ensembles of Deep Neural Networks, Deep Ensembles, are widely used as a simple way to boost predictive performance. However, their impact on algorithmic fairness is not well understood yet. Algorithmic fairness examines how a model's performance varies across socially relevant groups defined by protected attributes such as age, gender, or race. In this work, we explore the interplay between the performance gains from Deep Ensembles and fairness. Our analysis reveals that they unevenly favor different groups, a phenomenon that we term the disparate benefits effect. We empirically investigate this effect using popular facial analysis and medical imaging datasets with protected group attributes and find that it affects multiple established group fairness metrics, including statistical parity and equal opportunity. Furthermore, we identify that the per-group differences in predictive…
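The two fairness metrics named in the abstract can be illustrated with a short sketch. It assumes a binary task, a binary protected attribute, and an ensemble that averages its members' positive-class probabilities; the function names, the 0.5 decision threshold, and the toy numbers are hypothetical and not taken from the paper.

```python
import numpy as np

def ensemble_probs(member_probs):
    """Average positive-class probabilities over members: (M, N) -> (N,)."""
    return np.mean(member_probs, axis=0)

def statistical_parity_diff(y_pred, group):
    """Absolute gap in positive prediction rate between groups 0 and 1."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Absolute gap in true positive rate between groups 0 and 1."""
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return abs(tprs[0] - tprs[1])

# Hypothetical toy data: 3 ensemble members, 8 examples, 2 groups.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
member_probs = np.array([
    [0.9, 0.6, 0.2, 0.4, 0.7, 0.4, 0.3, 0.6],
    [0.8, 0.7, 0.3, 0.3, 0.6, 0.3, 0.6, 0.2],
    [0.7, 0.8, 0.1, 0.2, 0.8, 0.2, 0.3, 0.1],
])

probs = ensemble_probs(member_probs)
y_pred = (probs >= 0.5).astype(int)  # assumed decision threshold

spd = statistical_parity_diff(y_pred, group)
eod = equal_opportunity_diff(y_true, y_pred, group)
print(f"statistical parity difference: {spd:.2f}")
print(f"equal opportunity difference:  {eod:.2f}")
```

Running the same two metric functions on a single member's thresholded predictions and on the ensemble's predictions is one way to check whether ensembling widens or narrows a gap for a given dataset.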
Taxonomy
Topics: Opinion Dynamics and Social Influence
Methods: Deep Ensembles
