# Advantage of grading classification using volumetric artificial intelligence for periventricular hyperintensity and deep subcortical white matter hyperintensity

**Authors:** Masashi Kuwabara, Fusao Ikawa, Shinji Nakazawa, Saori Koshino, Daizo Ishii, Hiroshi Kondo, Takeshi Hara, Shingo Matsuda, Yuyo Maeda, Shiyuki Maeyama, Yoshinobu Seo, Jinichi Sasanuma, Kimito Kondo, Nobutaka Horie

PMC · DOI: 10.1038/s41598-025-23859-2 · Scientific Reports · 2025-11-17

## TL;DR

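This study shows that an AI algorithm can grade two brain MRI findings, periventricular hyperintensity (PVH) and deep subcortical white matter hyperintensity (DWMH), on the Fazekas scale with accuracy comparable to, and on some measures higher than, that of a human expert.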

## Contribution

The authors developed an AI algorithm for automated Fazekas grading of PVH and DWMH on MRI and validated it against expert assessments in 246 patients (137 for training, 109 for testing).

## Key findings

- The AI achieved higher multi-class accuracy in PVH classification than the human expert (0.798 versus 0.743).
- For DWMH, the AI outperformed the expert in distinguishing Fazekas grades 0/1/2 from grade 3 (accuracy 0.954 versus 0.927).
- The AI showed good agreement with human raters for both PVH and DWMH, and lower variability in volume ratio distribution within the same grade.
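
To make the difference between multi-class accuracy and the collapsed 0/1/2 versus 3 comparison reported in the abstract concrete, here is a minimal sketch in Python. The grade lists are hypothetical toy values, not study data, and the helper names (`accuracy`, `mean_absolute_error`, `collapse`) are illustrative assumptions rather than the authors' code.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases where the predicted grade equals the reference grade."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average absolute distance between predicted and reference grades."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def collapse(grades: Sequence[int], threshold: int = 3) -> List[int]:
    """Collapse Fazekas grades 0-3 into two classes: below threshold vs. at or above it."""
    return [int(g >= threshold) for g in grades]


# Hypothetical toy grades (NOT data from the study).
reference = [0, 1, 2, 3, 1, 2, 3, 0]
predicted = [0, 1, 1, 3, 2, 2, 3, 0]

multi_acc = accuracy(reference, predicted)                       # four-class accuracy
binary_acc = accuracy(collapse(reference), collapse(predicted))  # grades 0/1/2 vs. grade 3
mae = mean_absolute_error(reference, predicted)

print(f"multi-class accuracy: {multi_acc:.3f}")
print(f"0/1/2 vs 3 accuracy:  {binary_acc:.3f}")
print(f"mean absolute error:  {mae:.3f}")
```

In this toy case, both disagreements are between grades 1 and 2, so they lower the four-class accuracy but vanish once grades 0/1/2 are merged. This is why a binary accuracy can be higher than a multi-class one on the same predictions.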

## Abstract

We developed and validated an artificial intelligence (AI) algorithm for the automated grading of periventricular hyperintensity (PVH) and deep subcortical white matter hyperintensity (DWMH) using magnetic resonance imaging. Overall, 246 patients were evaluated, with 137 and 109 allocated to the training and testing groups, respectively. AI-predicted grading according to the Fazekas scale was compared with expert assessments using accuracy, F1-score, and mean absolute error. Inter-rater agreement was evaluated using Fleiss’ kappa to assess consistency among human raters and Cohen’s kappa to measure agreement between the AI and individual human raters. The AI demonstrated superior multi-class accuracy in PVH classification compared with the human expert, achieving an accuracy of 0.798 versus 0.743. In DWMH classification, the AI outperformed the expert specifically in distinguishing Fazekas grades 0/1/2 from grade 3, achieving an accuracy of 0.954 compared with the expert’s 0.927. Inter-rater agreement analysis showed that for PVH and DWMH, the AI achieved “good agreement” with human raters. For PVH, the AI’s agreement exceeded the human inter-rater agreement. The developed AI also exhibited lower variability in volume ratio distribution within the same grade compared with human raters. The developed AI algorithm effectively distinguished between PVH and DWMH, achieving accuracy comparable to human performance.
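
The two agreement statistics named in the abstract can be sketched from their textbook definitions. The following is a minimal illustration, not the authors' analysis code: the function names, the toy ratings, and the choice of three raters are all assumptions made for the example, since the abstract does not state the number of human raters.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two raters grading the same cases (e.g. AI vs. one human)."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    p_o = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)     # agreement expected by chance
    if p_e == 1.0:                                     # both raters used a single identical grade
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def fleiss_kappa(ratings: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; ratings[i] lists the grades given to case i by each rater."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    counts = [Counter(row) for row in ratings]
    categories = sorted({g for row in ratings for g in row})
    # Per-case agreement: share of rater pairs that assigned the same grade.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_cases
    # Overall proportion of all ratings that fall in each grade.
    p_j = [sum(c[k] for c in counts) / (n_cases * n_raters) for k in categories]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy ratings (NOT data from the study).
ai_grades = [0, 0, 1, 1]
human_grades = [0, 1, 1, 1]
print(f"Cohen's kappa (AI vs. one rater): {cohen_kappa(ai_grades, human_grades):.3f}")

three_raters = [[0, 0, 0], [1, 1, 2], [2, 2, 2], [3, 3, 2]]  # assumed 3 raters per case
print(f"Fleiss' kappa (among raters):     {fleiss_kappa(three_raters):.3f}")
```

Cohen's kappa applies to exactly two raters, which fits the AI-versus-individual-rater comparison, while Fleiss' kappa generalizes chance-corrected agreement to a fixed panel of several raters. Both statistics here treat grades as unordered categories; a weighted variant would be needed to give partial credit for near misses on an ordinal scale.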

The online version contains supplementary material available at 10.1038/s41598-025-23859-2.

## Full-text entities

- **Diseases:** PVH (MESH:D054091), DWMH (MESH:D056784)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12624063/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12624063/full.md

---
Source: https://tomesphere.com/paper/PMC12624063