Study of the influence of a biased database on the prediction of standard algorithms for selecting the best candidate for an interview

Shuyu Wang; Angélique Saillet; Philomène Le Gall; Alain Lacroux; Christelle Martin-Lacroux; Vincent Brault

arXiv:2505.02609·cs.AI·May 6, 2025

TL;DR

This study investigates how biased training data impacts the effectiveness of standard AI algorithms in selecting the best candidates for interviews, highlighting the effects of biases and anonymization on prediction quality.

Contribution

It introduces a method to generate biased datasets and evaluates their impact on classic algorithms' ability to identify optimal candidates.

Findings

01. Bias in training data reduces algorithm accuracy

02. Anonymization affects prediction quality

03. Bias type influences algorithm performance

Abstract

Artificial intelligence is used at various stages of the recruitment process to automatically select the best candidate for a position, with companies guaranteeing unbiased recruitment. However, the algorithms used are either trained by humans or are based on learning from past experiences that were biased. In this article, we propose to generate data mimicking external biases (discrimination) and internal biases (self-censorship) in order to train five classic algorithms and to study the extent to which they do or do not find the best candidates according to objective criteria. In addition, we study the influence of the anonymisation of files on the quality of predictions.
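To make the two bias types concrete, here is a minimal sketch of how such data could be simulated. It is not the authors' generator: the function names, the single `skill` score, the 0.5 invitation threshold, and the way each bias is applied are all assumptions chosen for illustration. Discrimination is modelled as a penalty on the perceived score of one group, and self-censorship as a probability that members of that group do not apply at all.

```python
import numpy as np

def generate_candidates(n, discrimination=0.0, self_censorship=0.0, seed=0):
    """Simulate applicants (hypothetical model, not the paper's generator).

    Returns arrays restricted to the people who actually apply:
    group (0/1), skill, invited (biased historical label),
    deserving (label based on the objective criterion only).
    """
    rng = np.random.default_rng(seed)
    group = rng.integers(0, 2, size=n)      # 1 = group exposed to bias (assumption)
    skill = rng.normal(0.0, 1.0, size=n)    # objective criterion, same law in both groups

    # Internal bias: each member of group 1 gives up applying
    # with probability `self_censorship`.
    applies = ~((group == 1) & (rng.random(n) < self_censorship))

    # External bias: the recruiter lowers the perceived score of group 1.
    perceived = skill - discrimination * group

    invited = perceived > 0.5               # what a model would learn from
    deserving = skill > 0.5                 # what an unbiased recruiter would decide
    return group[applies], skill[applies], invited[applies], deserving[applies]

def invitation_rates(group, invited):
    """Share of invited applicants in each group."""
    return {g: float(invited[group == g].mean()) for g in (0, 1)}

if __name__ == "__main__":
    group, skill, invited, deserving = generate_candidates(
        10_000, discrimination=1.0, self_censorship=0.5
    )
    print("share of group 1 among applicants:", float((group == 1).mean()))
    print("invitation rate per group:", invitation_rates(group, invited))
    print("agreement with objective label:", float((invited == deserving).mean()))
```

A classifier trained on `invited` sees the biased history, while `deserving` gives the objective reference against which its predictions can be scored. Dropping the `group` column before training would be one simple stand-in for anonymisation in this toy setting.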

Taxonomy

Topics: Forecasting Techniques and Applications · Advanced Statistical Methods and Models