# Diagnostic accuracy of ChatGPT for 12-lead ECG-based localisation of ventricular ectopic foci prior to catheter ablation

**Authors:** Kadri Murat Gürses, Hüseyin Tezcan, Muhammed Ulvi Yalçın, Halil Özalp, Abdullah Tunçez, Yasin Özen

PMC · DOI: 10.3389/fmed.2025.1685419 · Frontiers in Medicine · 2026-01-12

## TL;DR

ChatGPT performed poorly in localizing ventricular ectopic foci from ECGs, showing no better accuracy than chance and underperforming existing methods.

## Contribution

First evaluation of ChatGPT's diagnostic accuracy for ECG-based localization of ventricular ectopic foci before ablation.

## Key findings

- ChatGPT correctly localized only 34% of cases, with a Cohen’s κ of −0.02, indicating no agreement beyond chance.
- Performance was poor for all anatomical origins, with no correct predictions for fascicular or epicardial foci.
- ChatGPT's accuracy did not differ by the presence of structural heart disease (p = 0.43), and neither procedure duration nor acute ablation success (96%) was affected by whether its prediction was correct.

## Abstract

Precise pre-procedural localisation of ventricular ectopic (VE) foci shortens mapping time, reduces fluoroscopy, and improves ablation success. Large language models such as ChatGPT offer instant, free-text clinical support; however, their accuracy in ECG-based VE localisation is unknown.

In this single-centre pilot study, we assessed the diagnostic accuracy of ChatGPT in 50 consecutive adults (mean age: 43 ± 14 years; 58% women) scheduled for first-time VE ablation. ChatGPT served as the index test, and invasive electroanatomical mapping during the ablation served as the reference standard. A blinded electrophysiologist converted each index 12-lead ECG into a structured textual description of QRS morphology. ChatGPT-4o (temperature 0.2) was then tasked with assigning one of five anatomical origins (RVOT, LVOT, papillary muscle, fascicular, and epicardial). Predictions were compared with the mapping result, and agreement was measured using Cohen’s kappa (κ).
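The classification step described above constrains the model to one of five labels. A minimal sketch of how a free-text reply might be normalised to a single label follows. The paper summary does not describe how replies were parsed, so the `ORIGINS` alias table and the `parse_origin` helper are hypothetical and only illustrate the idea.

```python
from typing import Optional

# The five candidate origins named in the abstract.
# The lowercase aliases are illustrative assumptions, not taken from the paper.
ORIGINS = {
    "RVOT": ("rvot", "right ventricular outflow"),
    "LVOT": ("lvot", "left ventricular outflow"),
    "papillary muscle": ("papillary",),
    "fascicular": ("fascicular",),
    "epicardial": ("epicardial",),
}


def parse_origin(reply: str) -> Optional[str]:
    """Map a free-text reply to exactly one origin label.

    Returns None when the reply mentions no label or more than one,
    so ambiguous answers are never silently counted as a prediction.
    """
    text = reply.lower()
    hits = [
        label
        for label, aliases in ORIGINS.items()
        if any(alias in text for alias in aliases)
    ]
    return hits[0] if len(hits) == 1 else None
```

Returning `None` for ambiguous replies is a design choice in this sketch: it keeps unclear answers visible as a separate category instead of forcing them into one of the five classes.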

Electro-anatomical mapping identified 30 RVOT, 11 LVOT, 4 papillary, 1 fascicular, and 4 epicardial foci. ChatGPT correctly localised 17/50 cases (34%), yielding an overall Cohen’s κ of −0.02 (95% CI −0.18 to 0.14). Sensitivity/specificity was 40%/55% for the RVOT and 36%/62% for the LVOT; no fascicular or epicardial origins were correctly predicted. The performance of ChatGPT did not differ based on the presence of structural heart disease (p = 0.43). The duration of the procedure and the acute ablation success rate (96%) were unaffected by the accuracy of ChatGPT.
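The metrics reported above can be sketched in a few lines. The per-class counts used below (for example, 12 true positives among 30 RVOT foci) are back-calculated from the reported percentages and class sizes; they are assumptions, not figures stated in the abstract. The full confusion matrix is not given here, so the κ function is shown on a hypothetical 2×2 table and does not attempt to reproduce the reported −0.02.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """One-vs-rest sensitivity and specificity for a single class."""
    return tp / (tp + fn), tn / (tn + fp)


def cohens_kappa(matrix: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: prediction)."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


# Overall accuracy as reported: 17 correct out of 50.
accuracy = 17 / 50

# Assumed counts, back-calculated from the reported 40%/55% (RVOT, 30 foci)
# and 36%/62% (LVOT, 11 foci) out of 50 cases in total.
rvot_sens, rvot_spec = sensitivity_specificity(tp=12, fn=18, tn=11, fp=9)
lvot_sens, lvot_spec = sensitivity_specificity(tp=4, fn=7, tn=24, fp=15)

# Hypothetical 2x2 table, used only to exercise the kappa formula.
toy_kappa = cohens_kappa([[20, 5], [10, 15]])
```

A κ near zero means the observed agreement is about what the marginal totals alone would produce, which is why the abstract describes the result as no better than chance even though 34% of cases were correct.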

Free-text querying of ChatGPT failed to provide clinically meaningful VE localisation, performing no better than chance and markedly below published ECG-based algorithms. This likely reflects the model’s lack of domain-specific training and its reliance on purely text-based reasoning without direct access to ECG signals. Current general-purpose language models should not be relied upon for procedural planning in VE ablation; future work must integrate multimodal training and domain-specific optimisation before LLMs can augment electrophysiology practice.

## Full-text entities

- **Diseases:** heart disease (MESH:D006331), VE (MESH:D018879)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12833440/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12833440/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12833440/full.md

---
Source: https://tomesphere.com/paper/PMC12833440