# Real-world evaluation of an automated EEG spike detection software in a tertiary centre compared to a clinical reference standard

**Authors:** C. Cook, A. Auwal, S. Eglese, B. Hywel, M. A. Ellul, B. D. Michael

PMC · DOI: 10.1007/s00415-026-13636-0 · 2026-01-30

## TL;DR

This study evaluates an automated EEG spike detection software in real-world clinical settings, finding it effective at ruling out spikes but prone to false positives.

## Contribution

The study provides a real-world evaluation of an automated EEG spike detection model using a large, clinically representative dataset.

## Key findings

- The model had a high negative predictive value (96.3%) for ruling out interictal epileptiform discharges (IEDs).
- However, it had a low positive predictive value (19.9%), indicating many false positives.
- The study highlights the need for clinical feedback to improve model utility in practice.

## Abstract

Interictal epileptiform discharges (IEDs) are transient spikes or waves that occur in electroencephalography (EEG) records and can help support the diagnosis and classification of epilepsy. High-throughput machine learning models aim to automate the detection of IEDs. Previous evaluations of machine learning models have reported non-inferiority compared to human experts, but these studies predominantly use small datasets of pre-selected, ‘IED rich’ records, which are not representative of clinical practice. Therefore, this study aims to analyse the accuracy of machine learning models in a large, routine, clinically representative cohort.

All routine EEGs performed in a large regional hospital in England between June 2024 and February 2025 were identified. EEG records were run through the commercial machine learning model P15, and automated IED reports were generated. The sensitivity, specificity, and positive and negative predictive values of P15-detected IEDs were evaluated using the final clinical report as the reference standard.

Of 484 EEG records, 53 were reported to contain at least one IED in the final clinical report. At P15’s default sensitivity setting, sensitivity for IED detection was 81.1% (95% CI: 77.6–84.6), specificity 59.9% (95% CI: 55.5–64.2), positive predictive value 19.9% (95% CI: 16.3–23.5) and negative predictive value 96.3% (95% CI: 94.6–98.0).
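For readers who want to see how the four reported figures relate to one another, the following is a minimal sketch of the standard 2×2 calculation. The abstract does not give the cell counts; the values used here (43 true positives, 173 false positives, 10 false negatives, 258 true negatives) are an assumption, back-calculated from the 484 records, the 53 IED-positive reports and the reported percentages. The names `Confusion` and `diagnostic_metrics` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Cells of a 2x2 table: software result vs. clinical reference standard."""
    tp: int  # software flags IEDs, clinical report confirms IEDs
    fp: int  # software flags IEDs, clinical report has none
    fn: int  # software flags nothing, clinical report has IEDs
    tn: int  # software flags nothing, clinical report has none


def diagnostic_metrics(c: Confusion) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: counts are not stated in the abstract; they are inferred so that
# tp + fn = 53 reference-positive records and the total is 484.
counts = Confusion(tp=43, fp=173, fn=10, tn=258)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the sketch reproduces the reported point estimates. It does not attempt to reproduce the confidence intervals, because the abstract does not state which interval method was used.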

This large-scale study of a machine learning model for identification of IEDs in a representative clinical population found a high negative predictive value suggesting that this may be a useful tool to rule out IEDs. However, the low positive predictive value demonstrates the potential for over-calling IEDs in routine EEGs. Future research should evaluate machine learning models alongside clinical feedback before this approach can have sufficient utility in direct clinical care.
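The contrast between a high negative and a low positive predictive value follows largely from how rare IEDs were in this cohort (53 of 484 records). The sketch below applies Bayes' rule to the reported sensitivity and specificity to show how the positive predictive value would shift with prevalence. It assumes, for illustration only, that sensitivity and specificity stay fixed as prevalence changes, which may not hold in practice; the 25% and 50% prevalences are hypothetical values standing in for an 'IED rich' dataset and are not taken from the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Point estimates reported in the abstract.
SENSITIVITY = 0.811
SPECIFICITY = 0.599

# Prevalence in the study cohort: 53 IED-positive records out of 484.
cohort_prevalence = 53 / 484

# ASSUMPTION: 0.25 and 0.50 are hypothetical prevalences chosen for contrast.
for prevalence in (cohort_prevalence, 0.25, 0.50):
    value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%} -> PPV {value:.1%}")
```

At the cohort prevalence this recovers a positive predictive value close to the reported 19.9%, which is consistent with the abstract's point that results from pre-selected records may not carry over to routine practice.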

## Linked entities

- **Diseases:** epilepsy (MONDO:0005027)

## Full-text entities

- **Diseases:** epileptiform activity (MESH:D014277), seizures (MESH:D012640), Epilepsy (MESH:D004827), IEDs (MESH:D019522), hyperventilation (MESH:D006985), EEG abnormalities (MESH:D000014)
- **Chemicals:** Persyst (-), spike (MESH:C010346)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12858492/full.md

---
Source: https://tomesphere.com/paper/PMC12858492