Machine Learning-Based Pre-Test Risk Stratification for PCR-Confirmed Chlamydia Using Patient-Reported Data and Urine Biomarkers
Mehrab Mahdian, Marko Lehes, Katrin Krolov, Tamas Pardy

TL;DR
This study evaluates machine learning models using patient-reported data and urine biomarkers to predict Chlamydia infection risk, aiming to optimize screening efforts with non-invasive, routinely available data.
Contribution
The study shows that urine biomarkers, alone or combined with patient-reported features, improve the robustness and performance of pre-test risk stratification in Chlamydia screening.
Findings
Urine biomarkers provide reliable predictive signals for risk stratification.
Combining feature groups marginally increases AUC and reduces variability.
Ensemble models yield the strongest predictive performance.
Abstract
Early identification of individuals at elevated risk of Chlamydia trachomatis infection may enable optimal use of molecular testing in resource-aware screening. We evaluate the feasibility of pre-test risk stratification (PTRS) using machine-learning models trained on routinely available, non-invasive clinical data. A curated dataset of 93 urine samples with PCR reference labels was analyzed using three feature groups: patient-reported history and symptoms, urine biomarkers from standard urinalysis, and their combination. Five supervised classifiers were evaluated using stratified 5-fold cross-validation with out-of-fold probability estimates. Performance was assessed using area under the receiver operating characteristic curve (AUC) and threshold-dependent metrics, with uncertainty quantified via bootstrap confidence intervals. Models using only patient-reported data showed…
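The evaluation step described in the abstract (AUC computed on out-of-fold probabilities, with bootstrap confidence intervals) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc_rank` and `bootstrap_auc_ci`, the percentile-bootstrap method, the number of resamples, and the synthetic labels and scores are all assumptions made for illustration. Only the sample count of 93 comes from the abstract; the 20 positives and the score distributions are invented.

```python
import random


def auc_rank(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as 0.5 (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (an assumed method; the abstract
    does not say which bootstrap variant was used)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUC undefined, redraw
        stats.append(auc_rank(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic stand-in for out-of-fold probabilities: 93 samples as in the
# abstract, with a hypothetical 20 positives and invented score distributions.
rng = random.Random(42)
labels = [1] * 20 + [0] * 73
scores = [rng.gauss(0.6, 0.2) if y == 1 else rng.gauss(0.4, 0.2) for y in labels]

auc = auc_rank(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With a dataset this small, the interval is typically wide, which is why reporting it alongside the point estimate matters when comparing feature groups.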
