# Derivation and validation of a machine learning-driven score to predict the diagnostic yield of endomyocardial biopsy

**Authors:** Christian Basile, Christian L. Polte, Piero Gentile, Entela Bollano, Araz Rawshani, Anders Oldfors, Charlotta Ljungman, Sven-Erik Bartfay, Pia Dahlberg, Clara Hjalmarsson, Marie Björkenstam, Elena Gualini, Antonio Cannatá, Patrizia Pedrotti, Andrea Garascia, Gianluigi Savarese, Aldo Pietro Maggioni, Kristjan Karason, Emanuele Bobbio

PMC · DOI: 10.1038/s41746-026-02421-y · NPJ Digital Medicine · 2026-02-09

## TL;DR

This study develops and validates a machine learning score that predicts whether an endomyocardial biopsy will yield a definitive diagnosis, using only non-invasive data to guide clinical decisions.

## Contribution

A machine learning score is developed and externally validated to predict the diagnostic yield of endomyocardial biopsy from non-invasive patient data.

## Key findings

- Right ventricular late gadolinium enhancement on cardiac magnetic resonance was the strongest predictor of diagnostic yield.
- The model showed excellent discrimination, with an AUC of 0.92 in cross-validation and 0.91 in the testing set, and an AUC of 0.82 in an independent external cohort (n = 171).
- Biopsy yielded a definitive diagnosis in 19.9% of cases; amyloidosis was the most common diagnosis, accounting for 50% of diagnostic cases.

## Abstract

Despite its low diagnostic yield, endomyocardial biopsy (EMB) remains the gold standard for establishing a definitive diagnosis in many cardiomyopathies. We developed and validated a machine-learning-based score to predict the likelihood of diagnostic EMB using non-invasive data. We retrospectively analyzed 775 heart failure patients who underwent EMB. A random forest algorithm was selected for score development based on superior discriminative performance. The model was externally validated in an independent cohort (n = 171). The study population was predominantly male (72.1%), with half of the patients in NYHA class III–IV. EMB yielded a definitive diagnosis in 19.9% of cases, most commonly amyloidosis (50%). A predictive score (0–100 range) was derived from key non-invasive predictors. Right ventricular late gadolinium enhancement (LGE) on cardiac magnetic resonance emerged as the strongest predictor, followed by left ventricular and atrial LGE, NT-proBNP levels, and renal function. The model demonstrated excellent discrimination, with an area under the curve (AUC) of 0.92 (95% CI = 0.89–0.96) in cross-validation and 0.91 (95% CI = 0.86–0.98) in the testing set, with consistent performance on external validation (AUC 0.82, 95% CI = 0.76–0.89). This machine-learning-based score may provide a non-invasive tool to support EMB decision-making in clinical practice.
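To make the reported discrimination metric concrete, the following minimal sketch shows one common way to compute an area under the ROC curve and a 95% confidence interval for a 0–100 score. It is not the authors' code: the abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names (`auc`, `bootstrap_ci`) and the toy labels and scores are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a diagnostic biopsy (label 1) receives a
    higher score than a non-diagnostic one (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = diagnostic EMB, 0 = non-diagnostic EMB.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [5, 12, 20, 33, 40, 47, 71, 90]

point_estimate = auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_ci(toy_labels, toy_scores)
```

On the toy data, 14 of the 15 diagnostic/non-diagnostic pairs are ranked correctly, so `point_estimate` is 14/15. The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a rank-based formulation would scale better.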

## Linked entities

- **Diseases:** amyloidosis (MONDO:0019065)

## Full-text entities

- **Diseases:** heart failure (MESH:D006333), cardiomyopathies (MESH:D009202), amyloidosis (MESH:D000686)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12996545/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12996545/full.md

---
Source: https://tomesphere.com/paper/PMC12996545