Towards Assessing Data Bias in Clinical Trials
Chiara Criscuolo, Tommaso Dolci, Mattia Salnitri

TL;DR
This paper introduces a comprehensive method to identify, quantify, and mitigate data bias in clinical trial datasets, aiming to improve the accuracy and fairness of health care research analyses.
Contribution
It defines types of data bias, develops metrics for their measurement, and offers guidelines applicable to various clinical trial data sources.
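As a rough illustration of what a data-bias metric can look like, the sketch below compares each group's share of a dataset with its share of a reference population. This is not the metric defined in the paper. The function name `representation_gaps`, the group labels, and the reference shares are all hypothetical and chosen only for the example.

```python
from collections import Counter


def representation_gaps(groups, reference_shares):
    """Return, per group, the dataset share minus the reference-population share.

    groups: iterable with one group label per participant (hypothetical input).
    reference_shares: dict mapping group label to its expected share (0..1).
    A negative gap means the group is under-represented in the dataset.
    """
    counts = Counter(groups)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no participants given")
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Toy data, invented for illustration only.
participants = ["A"] * 90 + ["B"] * 10
reference = {"A": 0.7, "B": 0.3}

gaps = representation_gaps(participants, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large negative gap for a group would flag the kind of under-representation the abstract describes for ethnic minorities. How large a gap matters, and which reference population to use, are study-specific choices that this sketch leaves open.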
Findings
The proposed method effectively characterizes data bias.
Guidelines assist researchers in bias mitigation.
Evaluation includes theoretical analysis and expert interviews.
Abstract
Algorithms and technologies are essential tools that pervade all aspects of our daily lives. In recent decades, health care research has benefited from new computer-based recruiting methods, the use of federated architectures for data storage, the introduction of innovative analyses of datasets, and so on. Nevertheless, health care datasets can still be affected by data bias. Biased datasets provide a distorted view of reality, leading to wrong analysis results and, consequently, wrong decisions. For example, in a clinical trial that studied the risk of cardiovascular diseases, predictions were wrong due to the lack of data on ethnic minorities. It is therefore of paramount importance for researchers to acknowledge data bias that may be present in the datasets they use, adopt techniques to mitigate it where needed, and check whether and how analysis results are affected. This paper…
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
