# Identifying recurrent stone formers with machine learning: A single‐centre observational study

**Authors:** Pedro Amado, Daniel G. Fuster, Matteo Bargagli, Dominik Obrist, Fiona Burkhard, Beat Roth, Francesco Clavica, Shaokai Zheng

PMC · DOI: 10.1002/bco2.70176 · 2026-03-06

## TL;DR

This study uses machine learning to identify patients likely to experience recurrent kidney stones, aiming to improve early detection and treatment strategies.

## Contribution

The novel contribution is applying machine learning to routinely collected clinical and laboratory data to identify recurrent kidney stone formers, with better performance than previously reported work.

## Key findings

- Machine learning models achieved a mean AUC of 0.71 ± 0.03 on the held‐out test set when predicting recurrent kidney stone events.
- Estimated glomerular filtration rate, age at first stone episode, oxalate, and pH were key predictive features.
- Median imputation yielded better‐performing models than kernel density estimation (KDE) or k‐nearest neighbour (KNN) imputation.

## Abstract

Kidney stones affect 12% of the population over their lifetime. Recurrent kidney stones lead to repeated interventions and excessive healthcare costs. Despite progress in imaging and metabolic evaluations, models to accurately identify patients at high risk are missing. In this study, we investigate whether machine learning methods can facilitate early identification of recurrent kidney stone formers.

This observational study included data from the single‐centre Bern Kidney Stone Registry. Each participant had at least one stone episode. Three data imputation techniques, kernel density estimation (KDE) imputation, median imputation and k‐nearest neighbour (KNN) imputation, were evaluated with a logistic regression model. Feature selection was performed with recursive feature elimination. A fivefold cross‐validation was conducted using an 80/20 split. The classification target was a recurrent kidney stone event.
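The workflow described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data in place of the registry, picks hypothetical settings (eight features, four retained by recursive feature elimination, five neighbours for KNN), and omits KDE imputation because scikit-learn has no built-in imputer for it. It also takes one reading of the 80/20 split, namely that each fold of a fivefold cross-validation trains on 80% of the patients and tests on the remaining 20%.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT registry data): 300 patients, 8 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
# Hypothetical outcome driven by the first two features only.
logits = 1.2 * X[:, 0] - 1.0 * X[:, 1] + 1.0
y = (rng.random(300) < 1.0 / (1.0 + np.exp(-logits))).astype(int)
# Remove 15% of entries at random to mimic missing laboratory values.
X[rng.random(X.shape) < 0.15] = np.nan


def make_pipeline(imputer):
    """Impute, scale, select features with RFE, then fit logistic regression."""
    return Pipeline([
        ("impute", imputer),
        ("scale", StandardScaler()),
        ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=4)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Two of the three imputation strategies named in the abstract.
imputers = {
    "median": SimpleImputer(strategy="median"),
    "knn": KNNImputer(n_neighbors=5),
}

# Fivefold stratified CV: each fold holds out 20% of the patients.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, imputer in imputers.items():
    scores = cross_val_score(make_pipeline(imputer), X, y, cv=cv, scoring="roc_auc")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: AUC = {scores.mean():.2f} +/- {scores.std():.2f}")
```

Placing the imputer and the feature selector inside the pipeline means both are refitted on the training portion of every fold, so no information from the held‐out patients leaks into preprocessing.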

A total of 706 patients (median age 47; 71.2% male) were included, and 563 (79.7%) had recurrent stone events. Median imputation yielded the best‐performing models. A mean area under the receiver operating characteristic curve (AUC) of 0.71 ± 0.03 was achieved on the held‐out test set. Estimated glomerular filtration rate (OR = 0.45, 95% CI: 0.42–0.49), age at first stone episode (OR = 0.50, 95% CI: 0.46–0.56), oxalate (OR = 1.83, 95% CI: 1.43–2.23) and pH (OR = 1.74, 95% CI: 1.47–1.89) were among the most descriptive features.
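To read the odds ratios above, recall that a logistic regression coefficient β corresponds to an odds ratio of exp(β), so an OR below 1 means the odds of recurrence fall as the feature increases and an OR above 1 means they rise. The sketch below converts the reported ORs back to coefficients and shows the textbook Wald interval, exp(β ± 1.96·SE). The helper name `odds_ratio_with_ci` and the standard error in the usage line are hypothetical, and the abstract states neither the unit of change behind each OR nor how the intervals were computed.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic-regression coefficient.

    Uses the standard Wald interval exp(beta +/- z * se); this is an
    assumption, not necessarily the method used in the study.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Odds ratios as reported in the abstract.
reported = {
    "eGFR": 0.45,
    "age at first stone episode": 0.50,
    "oxalate": 1.83,
    "pH": 1.74,
}

coefficients = {}
for name, odds_ratio in reported.items():
    beta = math.log(odds_ratio)  # invert OR = exp(beta)
    coefficients[name] = beta
    direction = "lower" if beta < 0 else "higher"
    print(f"{name}: beta = {beta:+.2f}, higher values -> {direction} odds of recurrence")

# Usage with a hypothetical standard error of 0.1 on the oxalate coefficient.
or_value, lower, upper = odds_ratio_with_ci(coefficients["oxalate"], se=0.1)
```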

Routinely collected clinical and laboratory variables can potentially be exploited to identify recurrent stone formers, and our machine learning approach achieved better performance than previously reported work. With further validation on external datasets, our routine could support clinicians in designing dietary, medical or surveillance strategies, thereby reducing recurrence rates and improving long‐term outcomes for patients with stone‐forming conditions.

## Linked entities

- **Chemicals:** oxalate (PubChem CID 71081)

## Full-text entities

- **Diseases:** Hypertension (MESH:D006973), hyperoxaluria (MESH:D006959), cystine (MESH:D003554), Kidney Stone (MESH:D007669), infection (MESH:D007239), bowel disease (MESH:D015212), kidney damage (MESH:D007674), hypercalciuria (MESH:D053565), gout (MESH:D006073), pain (MESH:D010146), nephrolithiasis (MESH:D053040), diabetes (MESH:D003920), calcium oxalate stones (MESH:C563477), obesity (MESH:D009765), smoking (MESH:D015208), stone formation (MESH:D058426), metabolic abnormalities (MESH:D008659)
- **Chemicals:** Potassium (MESH:D011188), calcium oxalate stone (-), brushite (MESH:C494366), citrate (MESH:D019343), calcium (MESH:D002118), uric acid (MESH:D014527), oxalate (MESH:D010070), calcium oxalate (MESH:D002129), calcium phosphate (MESH:C020243)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** p.E161K

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12966608/full.md

---
Source: https://tomesphere.com/paper/PMC12966608