A First Look at Fairness of Machine Learning Based Code Reviewer Recommendation
Mohammad Mahdi Mohajer, Alvine Boaye Belle, Nima Shiri Harzevili, Junjie Wang, Hadi Hemmati, Song Wang, Zhen Ming (Jack) Jiang

TL;DR
This paper investigates the fairness of machine learning models in code reviewer recommendation, revealing existing biases, analyzing their causes, and proposing solutions to improve fairness, especially in imbalanced datasets.
Contribution
First empirical study on ML fairness in software engineering, specifically in code reviewer recommendation, identifying biases and proposing mitigation strategies.
Findings
ML-based reviewer recommendation systems are unfair, favoring male reviewers.
Current mitigation methods can double fairness in balanced datasets.
Effectiveness of mitigation methods is limited on imbalanced or skewed data.
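A gap like the one described in these findings can be illustrated with a small sketch. It compares each group's share of recommendation slots with its share of the candidate reviewer pool. This is an illustrative disparity measure under stated assumptions, not necessarily the metric used in the paper; the function names, the `group_of` mapping, and the toy data are all hypothetical.

```python
from collections import Counter


def group_shares(items, group_of):
    """Return the fraction of `items` that belongs to each group."""
    counts = Counter(group_of[i] for i in items)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def recommendation_gap(recommendations, group_of):
    """Share of recommendation slots minus share of the reviewer pool, per group.

    Assumption: the baseline is the group's share of the candidate pool
    (every key of `group_of`). A positive value means the group is
    recommended more often than its pool share would suggest.
    """
    pool_share = group_shares(list(group_of), group_of)
    slots = [r for recs in recommendations for r in recs]
    rec_share = group_shares(slots, group_of)
    return {g: rec_share.get(g, 0.0) - pool_share[g] for g in pool_share}


# Hypothetical toy data: four reviewers, four review requests, top-2 lists.
group_of = {"alice": "female", "bob": "male", "carol": "female", "dave": "male"}
recommendations = [
    ["bob", "dave"],
    ["bob", "alice"],
    ["dave", "bob"],
    ["carol", "dave"],
]

gap = recommendation_gap(recommendations, group_of)
print(gap)
```

In this toy data both groups make up half of the pool, so any nonzero gap comes purely from how often each group appears in the recommendation lists.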
Abstract
The fairness of machine learning (ML) approaches is critical to the reliability of modern artificial intelligence systems. Despite extensive study on this topic, the fairness of ML models in the software engineering (SE) domain has not been well explored yet. As a result, many ML-powered software systems, particularly those utilized in the software engineering community, continue to be prone to fairness issues. Taking one of the typical SE tasks, i.e., code reviewer recommendation, as a subject, this paper conducts the first study toward investigating the issue of fairness of ML applications in the SE domain. Our empirical study demonstrates that current state-of-the-art ML-based code reviewer recommendation techniques exhibit unfairness and discriminating behaviors. Specifically, male reviewers get on average 7.25% more recommendations than female code reviewers compared to their…
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Mobile Crowdsensing and Crowdsourcing
