Stratifying risk of disease in haematuria patients using machine learning techniques to improve diagnostics
Anna Drożdż, Brian Duggan, Mark W. Ruddock, Cherith N. Reid, Mary Jo Kurth, Joanne Watt, Allister Irvine, John Lamont, Peter Fitzgerald, Declan O’Rourke, David Curry, Mark Evans, Ruth Boyd, Jose Sousa

TL;DR
This study uses machine learning to classify haematuria patients into healthy or sick groups and identifies key biomarkers for better diagnosis.
Contribution
The study introduces the CACTUS algorithm as a robust method for classifying haematuria patients in unbalanced datasets and identifies gender-specific biomarkers.
Findings
The CACTUS algorithm achieved a balanced accuracy of 0.747 for both genders in classifying haematuria patients.
Microalbumin, male gender, and tPSA were identified as the most informative biomarkers for the whole dataset.
Gender-specific biomarkers were also found to be significant: tPSA and cystatin C for males, and IL-8 for females.
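The balanced accuracy reported above is the mean of sensitivity and specificity, so each class counts equally regardless of its size. The sketch below is an illustration only: the labels are made up and are not taken from the HaBio cohort, and the function name `balanced_accuracy` and the label coding (1 = sick, 0 = healthy) are assumptions introduced for this example.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = sick, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sick, predicted sick
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick, predicted healthy
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, predicted healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, predicted sick
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy data (invented): 8 sick patients, 2 healthy patients.
y_true = [1] * 8 + [0] * 2
# A trivial classifier that labels every patient as sick.
y_pred = [1] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)

print(plain_accuracy)  # 0.8
print(balanced)        # 0.5
```

On this toy data the trivial classifier scores 0.8 on plain accuracy but only 0.5 on balanced accuracy, which is why the latter is the more informative figure when one class dominates.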
Abstract
Detailed and invasive clinical investigations are required to identify the causes of haematuria. A highly unbalanced patient population (predominantly male) and a wide range of potential causes make the ability to correctly classify patients and identify patient-specific biomarkers a major challenge. Studies have shown that it is possible to improve the diagnosis using multi-marker analysis, even in unbalanced datasets, by applying advanced analytical methods. Here, we applied several machine learning algorithms to classify patients from the haematuria patient cohort (HaBio) by analysing multiple biomarkers and to identify the most relevant ones. We applied several classification and feature selection methods (k-means clustering, decision trees, random forest with a LIME explainer, and the CACTUS algorithm) to stratify patients into two groups: healthy (with no clear cause of haematuria) or…
Taxonomy
Topics: Blood donation and transfusion practices · Chronic Kidney Disease and Diabetes · Renal Diseases and Glomerulopathies
