Representation Debiasing of Generated Data Involving Domain Experts

Aditya Bhattacharya; Simone Stumpf; Katrien Verbert

arXiv:2407.09485·cs.HC·July 16, 2024

TL;DR

This paper proposes human-in-the-loop methods involving domain experts to improve data generation processes, aiming to effectively reduce representation bias in AI datasets and enhance model generalisation.

Contribution

It introduces interactive approaches that enable domain experts to guide and validate data augmentation, addressing limitations of existing bias mitigation techniques.

Findings

01 Enhanced data debiasing through expert-guided augmentation

02 Improved model generalisation on biased datasets

03 Framework for collaborative bias mitigation in AI systems

Abstract

Biases in Artificial Intelligence (AI) or Machine Learning (ML) systems due to skewed datasets problematise the application of prediction models in practice. Representation bias is a prevalent form of bias found in the majority of datasets. This bias arises when training data inadequately represents certain segments of the data space, resulting in poor generalisation of prediction models. Despite AI practitioners employing various methods to mitigate representation bias, their effectiveness is often limited due to a lack of thorough domain knowledge. To address this limitation, this paper introduces human-in-the-loop interaction approaches for representation debiasing of generated data involving domain experts. Our work advocates for a controlled data generation process involving domain experts to effectively mitigate the effects of representation bias. We argue that domain experts can…
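
To make the abstract's notion of representation bias concrete, the following is a minimal sketch of one way to flag segments of a dataset that are sparsely represented. It is not the authors' method: the function name `underrepresented_segments`, the `age_band` attribute, the toy records, and the 10% threshold are all illustrative assumptions. In a human-in-the-loop setting such as the one the paper advocates, the flagged segments would presumably be the ones a domain expert reviews before any data is generated for them.

```python
from collections import Counter


def underrepresented_segments(records, key, min_share=0.10):
    """Return segments whose share of the records falls below min_share.

    records:   list of dicts, one per data point
    key:       attribute that defines the segment (hypothetical, e.g. "age_band")
    min_share: assumed threshold; a domain expert would choose this in practice
    """
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {
        segment: n / total
        for segment, n in counts.items()
        if n / total < min_share
    }


# Hypothetical toy data, not taken from the paper.
records = (
    [{"age_band": "18-40"}] * 60
    + [{"age_band": "41-65"}] * 35
    + [{"age_band": "65+"}] * 5
)

flagged = underrepresented_segments(records, "age_band")
print(flagged)
```

Here only the `65+` segment falls below the assumed threshold. A frequency check like this captures just one simple view of representation bias; it says nothing about whether generated data for the flagged segment is realistic, which is the part the paper argues needs domain expertise.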
