Causal Synthetic Data Generation in Recruitment
Andrea Iommi, Antonio Mastropietro, Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri

TL;DR
This paper introduces a causal generative approach to create synthetic recruitment data that preserves causal relationships, aiming to improve fairness and transparency in candidate ranking models under privacy constraints.
Contribution
The study develops a novel Synthetic Data Generation (SDG) method that uses two domain-informed Causal Generative Models (CGMs) to produce fair, synthetic recruitment datasets for evaluating bias in ranking algorithms.
Findings
Synthetic data preserves causal relationships.
Controlled bias scenarios allow the fairness of ranking models to be evaluated.
Realistic synthetic data can be generated without exposing sensitive candidate information.
Abstract
The importance of Synthetic Data Generation (SDG) has increased significantly in domains where data quality is poor or access is limited due to privacy and regulatory constraints. One such domain is recruitment, where publicly available datasets are scarce due to the sensitive nature of information typically found in curricula vitae, such as gender, disability status, or age. This lack of accessible, representative data presents a significant obstacle to the development of fair and transparent machine learning models, particularly ranking algorithms that require large volumes of data to effectively learn how to recommend candidates. In the absence of such data, these models are prone to poor generalisation and may fail to perform reliably in real-world scenarios. Recent advances in Causal Generative Models (CGMs) offer a promising solution. CGMs enable the generation of synthetic…
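As an illustration of the general idea behind causal generative sampling with a controllable bias, here is a minimal sketch of a toy structural causal model. It is not the paper's method: the causal graph, the variable names (`gender`, `experience`, `skill`, `score`), the coefficients, and the `bias` parameter are all hypothetical choices made for this example.

```python
import random


def sample_candidate(rng, bias=0.0):
    """Draw one synthetic candidate from a toy structural causal model.

    Hypothetical causal graph (an assumption, not taken from the paper):
        experience -> skill -> score
        gender -> score   (active only when bias != 0)
    """
    gender = rng.choice(["F", "M"])              # exogenous sensitive attribute
    experience = rng.uniform(0, 20)              # exogenous, in years
    skill = 0.4 * experience + rng.gauss(0, 1)   # caused by experience plus noise
    penalty = bias if gender == "F" else 0.0     # controlled bias injected on one group
    score = skill - penalty + rng.gauss(0, 0.5)  # caused by skill and, if biased, gender
    return {"gender": gender, "experience": experience,
            "skill": skill, "score": score}


def generate(n, bias=0.0, seed=0):
    """Generate n synthetic candidates with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [sample_candidate(rng, bias) for _ in range(n)]


def mean_score_gap(rows):
    """Mean score of group M minus mean score of group F."""
    by_group = {"F": [], "M": []}
    for row in rows:
        by_group[row["gender"]].append(row["score"])
    mean_m = sum(by_group["M"]) / len(by_group["M"])
    mean_f = sum(by_group["F"]) / len(by_group["F"])
    return mean_m - mean_f


fair = generate(5000, bias=0.0)
biased = generate(5000, bias=2.0)
print(f"score gap without bias: {mean_score_gap(fair):.2f}")
print(f"score gap with bias:    {mean_score_gap(biased):.2f}")
```

Because both datasets use the same seed and the bias term does not change how random numbers are consumed, the two score gaps differ by the injected bias. A ranking model trained on either dataset could then be audited against a known ground truth.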
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Advanced Causal Inference Techniques · Machine Learning in Healthcare
