# External validation of a prediction model for disability and pain after lumbar disc herniation surgery: a prospective international registry-based cohort study

**Authors:** Allan ABBOTT, Casper Friis PEDERSEN, Henrik HEDEVIK, Catharina PARAI, Martin A GOROSITO, Mikkel ANDERSEN, Tor INGEBRIGTSEN, Tore K SOLBERG, Margreth GROTLE, Bjørnar BERG

PMC · DOI: 10.2340/17453674.2025.44251 · 2025-07-07

## TL;DR

This study externally validated machine learning models, developed on Norwegian registry data, that predict disability and pain 12 months after lumbar disc herniation surgery, and found comparable performance in Swedish and Danish cohorts.

## Contribution

The study externally validated existing machine learning models in new international cohorts, confirming their reproducibility and potential clinical utility.

## Key findings

- The models showed acceptable discrimination (C-statistic 0.70–0.81) for predicting disability and pain outcomes.
- Calibration slope point estimates ranged from 0.79 to 1.03 across models and cohorts, with minor to modest heterogeneity in the calibration plots.
- Decision curve analyses showed a potential net benefit of treating in accordance with the models compared with treating all patients or none.

## Abstract

We aimed to externally validate machine learning models developed in Norway by evaluating how well they predict disability and pain 12 months after lumbar disc herniation surgery in Swedish and Danish cohorts.

Data were extracted for patients undergoing microdiscectomy or open discectomy for lumbar disc herniation in the NORspine, SweSpine and DaneSpine national registries. The outcomes of interest were changes in the Oswestry Disability Index (ODI) (≥ 22 points), the Numeric Rating Scale (NRS) for back pain (≥ 2 points), and the NRS for leg pain (≥ 4 points). Model performance was evaluated by discrimination (C-statistic), calibration, overall fit, and net benefit.
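
As an illustration of the outcome definition and the discrimination metric, the sketch below dichotomizes change scores at the thresholds above and computes a pairwise C-statistic. It is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes that change is calculated as baseline minus 12-month score and that meeting the threshold counts as the event, a direction the abstract does not spell out.

```python
from itertools import product

# Change thresholds (in points) taken from the abstract.
THRESHOLDS = {"odi": 22, "nrs_back": 2, "nrs_leg": 4}


def is_success(baseline, follow_up, scale):
    """Return True if the improvement meets the threshold for `scale`.

    Assumption: change = baseline - follow_up, so a positive change
    means less disability or pain at 12 months.
    """
    return (baseline - follow_up) >= THRESHOLDS[scale]


def c_statistic(y_true, y_prob):
    """Pairwise C-statistic (equivalent to the area under the ROC curve).

    Proportion of (event, non-event) pairs in which the event received
    the higher predicted probability; ties count as half.
    """
    events = [p for y, p in zip(y_true, y_prob) if y]
    non_events = [p for y, p in zip(y_true, y_prob) if not y]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non in product(events, non_events):
        if p_event > p_non:
            concordant += 1.0
        elif p_event == p_non:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

For four hypothetical patients with outcomes `[1, 1, 0, 0]` and predicted probabilities `[0.9, 0.4, 0.5, 0.2]`, three of the four event/non-event pairs are concordant, so `c_statistic` returns 0.75. The pairwise loop is quadratic in cohort size; for registry-scale data a rank-based implementation would be the more practical choice.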

For the ODI model, the NORspine cohort included 22,529 patients, the SweSpine cohort included 10,129 patients, and the DaneSpine cohort included 5,670 patients. The ODI model’s C-statistic varied between 0.76 and 0.81 and calibration slope point estimates varied between 0.84 and 0.99. The C-statistic for NRS back pain varied between 0.70 and 0.76, and calibration slopes varied between 0.79 and 1.03. The C-statistic for NRS leg pain varied between 0.71 and 0.74, and calibration slopes varied between 0.90 and 1.02. Overall fit and calibration metrics were acceptable, with minor to modest but explainable heterogeneity observed in the calibration plots. Decision curve analyses showed a clear potential net benefit of treating in accordance with the prediction models compared with treating all patients or none.
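
To make the calibration slope and net benefit figures concrete, here is a minimal sketch of how these two quantities are commonly computed. It is not taken from the paper: the function names are hypothetical, the slope is estimated by regressing the observed outcome on the logit of the predicted probability (a standard logistic recalibration fit, solved here with a plain Newton-Raphson loop), and net benefit follows the usual decision curve formula at a chosen threshold probability.

```python
import math


def logit(p):
    # Undefined for p of exactly 0 or 1; predictions are assumed to lie strictly between.
    return math.log(p / (1.0 - p))


def calibration_slope(y_true, y_prob, n_iter=25, tol=1e-10):
    """Fit y ~ sigmoid(a + b * logit(p)) by Newton-Raphson.

    Returns (a, b). A slope b near 1 suggests predictions are neither
    too extreme (b < 1) nor too moderate (b > 1).
    """
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("singular information matrix")
        # Newton step: solve the 2x2 system H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and not y)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone. Treating no one has net benefit 0."""
    prevalence = sum(1 for y in y_true if y) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A decision curve plots `net_benefit` against a range of threshold probabilities alongside the treat-all and treat-none references; the model adds value at thresholds where its curve lies above both. Which outcome is coded as the event (success or non-success) changes the interpretation and is an assumption left open in this sketch.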

Machine learning models predicting treatment success/non-success in disability and pain 12 months after surgery for lumbar disc herniation showed acceptable discrimination, calibration, overall fit, and net benefit, and this performance was reproducible in similar international contexts. Future clinical impact studies are required.

## Full-text entities

- **Diseases:** lumbar disc herniation (MESH:C535531), back pain (MESH:D001416), leg pain (MESH:D010146)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12232440/full.md

---
Source: https://tomesphere.com/paper/PMC12232440