Study of the influence of a biased database on the prediction of standard algorithms for selecting the best candidate for an interview
Shuyu Wang, Angélique Saillet, Philomène Le Gall, Alain Lacroux, Christelle Martin-Lacroux, Vincent Brault

TL;DR
This study investigates how biased training data impacts the effectiveness of standard AI algorithms in selecting the best candidates for interviews, highlighting the effects of biases and anonymization on prediction quality.
Contribution
It introduces a method to generate biased datasets and evaluates their impact on classic algorithms' ability to identify optimal candidates.
Findings
Bias in training data reduces algorithm accuracy
Anonymization affects prediction quality
Bias type influences algorithm performance
Abstract
Artificial intelligence is used at various stages of the recruitment process to automatically select the best candidate for a position, with companies guaranteeing unbiased recruitment. However, the algorithms used are either trained by humans or are based on learning from past experiences that were biased. In this article, we propose to generate data mimicking external (discrimination) and internal biases (self-censorship) in order to train five classic algorithms and to study the extent to which they do or do not find the best candidates according to objective criteria. In addition, we study the influence of the anonymisation of files on the quality of predictions.
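The two kinds of bias named in the abstract can be illustrated with a small synthetic generator. This is only a minimal sketch and not the authors' generator: the group labels "A" and "B", the 0.5 selection threshold, the field names, and the parameters `discrimination` and `self_censorship` are all hypothetical choices made for illustration. External bias is modelled as a penalty applied by the past recruiter to one group's score, internal bias as a probability that members of that group never apply, and anonymisation as dropping the group field before training.

```python
import random


def generate_candidates(n, discrimination=0.0, self_censorship=0.0, seed=0):
    """Return a list of synthetic applications (all names are hypothetical).

    discrimination: penalty the past recruiter subtracts from group "B" scores
                    (external bias).
    self_censorship: probability that a group "B" candidate never applies
                     (internal bias).
    """
    rng = random.Random(seed)
    applications = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        skill = rng.random()  # objective criterion in [0, 1)
        # Internal bias: some group-B candidates drop out before applying.
        if group == "B" and rng.random() < self_censorship:
            continue
        # External bias: the historical decision uses a penalised score.
        perceived = skill - (discrimination if group == "B" else 0.0)
        applications.append({
            "group": group,
            "skill": skill,
            "objective_label": skill >= 0.5,      # who should be selected
            "historical_label": perceived >= 0.5,  # who was selected in the past
        })
    return applications


def selection_rate(applications, group, label):
    """Share of applicants from `group` whose `label` field is True."""
    rows = [a for a in applications if a["group"] == group]
    return sum(a[label] for a in rows) / len(rows)


def anonymise(applications):
    """Drop the group field, mimicking an anonymised file."""
    return [{k: v for k, v in a.items() if k != "group"} for a in applications]


if __name__ == "__main__":
    data = generate_candidates(2000, discrimination=0.2, self_censorship=0.3)
    for g in ("A", "B"):
        print(g,
              round(selection_rate(data, g, "objective_label"), 3),
              round(selection_rate(data, g, "historical_label"), 3))
```

A model trained on `historical_label` would learn from the biased decisions, while `objective_label` gives the reference against which its predictions could be compared. Note that anonymising only removes the explicit group field; whether that is enough to remove the bias from predictions is the kind of question the paper studies.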
Taxonomy
Topics: Forecasting Techniques and Applications · Advanced Statistical Methods and Models
