Explainable AI for Classifying UTI Risk Groups Using a Real-World Linked EHR and Pathology Lab Dataset
Yujie Dai, Brian Sullivan, Axel Montout, Amy Dillon, Chris Waller, Peter Acs, Rachel Denholm, Philip Williams, Alastair D Hay, Raul Santos-Rodriguez, Andrew Dowsey

TL;DR
This study develops an explainable AI framework using linked EHR and pathology data to classify UTI risk groups, addressing data challenges and providing interpretable insights for clinical decision support.
Contribution
It introduces a novel UTI risk estimation method with explainable AI techniques applied to a large real-world EHR dataset, emphasizing data fairness and interpretability.
Findings
Differences in predictors across risk groups
Explainable models identify key clinical features
Framework supports clinical decision-making
Abstract
The use of machine learning and AI on electronic health records (EHRs) holds substantial potential for clinical insight. However, this approach faces challenges due to data heterogeneity, sparsity, temporal misalignment, and limited labeled outcomes. In this context, we leverage a linked EHR dataset of approximately one million de-identified individuals from Bristol, North Somerset, and South Gloucestershire, UK, to characterize urinary tract infections (UTIs). We implemented a data pre-processing and curation pipeline that transforms the raw EHR data into a structured format suitable for developing predictive models focused on data fairness, accountability and transparency. Given the limited availability and biases of ground truth UTI outcomes, we introduce a UTI risk estimation framework informed by clinical expertise to estimate UTI risk across individual patient timelines. Pairwise…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBiomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
