Fairness Evaluation with Item Response Theory
Ziqi Xu, Sevvandi Kandanaarachchi, Cheng Soon Ong, Eirini Ntoutsi

TL;DR
This paper introduces Fair-IRT, a framework that applies Item Response Theory to evaluate fairness in machine learning models. It jointly estimates a parameter of each model (its ability to make fair predictions) and parameters of each individual (discrimination and difficulty), and experiments demonstrate its effectiveness.
Contribution
The paper presents the first application of IRT to the fairness evaluation of ML models. It introduces a fairness ability parameter for models, discrimination and difficulty parameters for individuals, and a new flatness measure for the Item Characteristic Curve (ICC).
Findings
Effective fairness evaluation demonstrated in experiments
Application to classification and regression tasks
Potential for assessing model inclusivity and equity
Abstract
Item Response Theory (IRT) has been widely used in educational psychometrics to assess student ability, as well as the difficulty and discrimination of test questions. In this context, discrimination specifically refers to how effectively a question distinguishes between students of different ability levels, and it does not carry any connotation related to fairness. In recent years, IRT has been successfully used to evaluate the predictive performance of Machine Learning (ML) models, but this paper marks its first application in fairness evaluation. In this paper, we propose a novel Fair-IRT framework to evaluate a set of predictive models on a set of individuals, while simultaneously eliciting specific parameters, namely, the ability to make fair predictions (a feature of predictive models), as well as the discrimination and difficulty of individuals that affect the prediction results.…
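To make the three IRT parameters concrete, here is a minimal sketch of the standard two-parameter logistic (2PL) item characteristic curve from psychometrics. It is an illustration under stated assumptions, not the Fair-IRT model itself: the paper may use a different IRT variant, and the function name `icc_2pl` and all numeric values are hypothetical. Following the abstract's mapping, `theta` plays the role of a model's ability to make fair predictions, while `discrimination` and `difficulty` belong to an individual.

```python
import math


def icc_2pl(theta, discrimination, difficulty):
    """Standard 2PL item characteristic curve (assumed here for illustration).

    theta          -- ability of the respondent (in Fair-IRT terms, a model)
    discrimination -- slope: how sharply the item separates abilities
    difficulty     -- location: the ability at which the response is 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical ability values for three respondents.
abilities = [-2.0, 0.0, 2.0]

# A highly discriminating item separates low and high ability clearly.
sharp_item = [icc_2pl(t, discrimination=2.0, difficulty=0.0) for t in abilities]

# A weakly discriminating item has a flatter curve over the same range.
flat_item = [icc_2pl(t, discrimination=0.2, difficulty=0.0) for t in abilities]

for t, s, f in zip(abilities, sharp_item, flat_item):
    print(f"theta={t:+.1f}  sharp={s:.3f}  flat={f:.3f}")
```

A flatter curve means the response barely changes with ability, which is the intuition behind discrimination in the psychometric sense described above; how the paper turns curve flatness into a fairness measure is not specified in this summary.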
Taxonomy
Topics: Qualitative Comparative Analysis Research
Methods: Sparse Evolutionary Training
