Representation Debiasing of Generated Data Involving Domain Experts
Aditya Bhattacharya, Simone Stumpf, Katrien Verbert

TL;DR
This paper proposes human-in-the-loop methods in which domain experts guide the data generation process, with the aim of reducing representation bias in AI datasets and improving model generalisation.
Contribution
It introduces interactive approaches that enable domain experts to guide and validate data augmentation, addressing limitations of existing bias mitigation techniques.
Findings
Enhanced data debiasing through expert-guided augmentation
Improved model generalisation on biased datasets
Framework for collaborative bias mitigation in AI systems
Abstract
Biases in Artificial Intelligence (AI) or Machine Learning (ML) systems due to skewed datasets problematise the application of prediction models in practice. Representation bias is a prevalent form of bias found in the majority of datasets. This bias arises when training data inadequately represents certain segments of the data space, resulting in poor generalisation of prediction models. Despite AI practitioners employing various methods to mitigate representation bias, their effectiveness is often limited due to a lack of thorough domain knowledge. To address this limitation, this paper introduces human-in-the-loop interaction approaches for representation debiasing of generated data involving domain experts. Our work advocates for a controlled data generation process involving domain experts to effectively mitigate the effects of representation bias. We argue that domain experts can…
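To make the abstract's notion of representation bias concrete, the following is a minimal sketch, not the authors' method. It assumes a toy tabular dataset, a hypothetical share threshold for flagging under-represented segments, and a hypothetical plausibility rule standing in for a domain expert's validation of generated records. All function names, field names, and values are illustrative assumptions.

```python
import random
from collections import Counter


def segment_shares(records, key):
    """Return the fraction of records that fall in each segment of `key`."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}


def underrepresented(shares, min_share):
    """List segments whose share of the data is below `min_share`."""
    return sorted(s for s, share in shares.items() if share < min_share)


def generate_candidates(records, key, segment, field, jitter, n, rng):
    """Create n candidate records by perturbing `field` of existing
    records in the given segment (a simple stand-in for a data generator)."""
    pool = [r for r in records if r[key] == segment]
    candidates = []
    for _ in range(n):
        candidate = dict(rng.choice(pool))
        candidate[field] += rng.randint(-jitter, jitter)
        candidates.append(candidate)
    return candidates


def expert_filter(candidates, is_plausible):
    """Keep only the candidates that the expert-supplied rule accepts."""
    return [c for c in candidates if is_plausible(c)]


# Toy data (assumed): group "B" is a small slice of the dataset.
data = [{"group": "A", "age": 30 + i} for i in range(9)]
data.append({"group": "B", "age": 19})

shares = segment_shares(data, "group")
flagged = underrepresented(shares, min_share=0.25)

rng = random.Random(0)
candidates = generate_candidates(data, "group", "B", "age", 3, 20, rng)

# Hypothetical expert constraint: only adult records are valid here.
accepted = expert_filter(candidates, lambda r: r["age"] >= 18)
augmented = data + accepted
```

The design point is the separation between generation and validation: the generator may propose implausible records, and the expert-supplied rule decides which ones enter the training data. In an interactive setting the lambda would be replaced by an expert's accept or reject decisions.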
