# Validation of an Artificial Intelligence Model for Breast Cancer Molecular Subtyping Using Hematoxylin and Eosin-Stained Whole-Slide Images in a Population-Based Cohort

**Authors:** Umay Kiraz, Claudio Fernandez-Martin, Emma Rewcastle, Einar G. Gudlaugsson, Ivar Skaland, Valery Naranjo, Sandra Morales-Martinez, Emiel A. M. Janssen

PMC · DOI: 10.3390/cancers17193234 · Cancers · 2025-10-05

## TL;DR

This study shows that an AI model can predict breast cancer molecular subtypes from standard H&E-stained slides, performing best at separating triple-negative from non-triple-negative cases (AUC = 0.823), and may offer a faster and more accessible alternative to traditional methods.

## Contribution

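The study validates a pretrained multiple-instance learning model for breast cancer molecular subtyping on H&E-stained whole-slide images from 538 cases in the population-based Stavanger Breast Cancer dataset, demonstrating its potential as a cost-effective and accessible diagnostic tool.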

## Key findings

- The AI model achieved strong performance in distinguishing triple-negative breast cancer from non-triple-negative cases (AUC = 0.823, accuracy = 0.833, F1-score = 0.824).
- Performance declined with an increasing number of molecular subtypes, indicating a need for further optimization.
- The model offers a promising alternative to IHC and gene expression profiling for breast cancer subtyping.

## Abstract

Breast cancer is a complex disease that can be classified into different biological subtypes. Correctly identifying these subtypes is essential in determining the most effective treatment for each patient. However, current methods such as gene expression testing and immunohistochemistry are either expensive, time-consuming, or not widely available in all healthcare settings. In this study, we explored whether a computer-based approach using artificial intelligence can accurately predict breast cancer subtypes by analyzing routine pathology slides stained with hematoxylin and eosin. This real-world validation study shows that this method can identify certain subtypes with promising accuracy, offering a faster and more accessible alternative to existing techniques. This research may help improve diagnostic processes, especially in hospitals with limited resources, and support more personalized treatment decisions for patients with breast cancer.

**Background/Objectives:** Breast cancer (BC) is the most commonly diagnosed cancer in women and the leading cause of cancer-related deaths globally. Molecular subtyping is crucial for prognosis and treatment planning, with immunohistochemistry (IHC) being the most commonly used method. However, IHC has limitations, including observer variability, a lack of standardization, and a lack of reproducibility. Gene expression profiling is considered the ground truth for molecular subtyping; unfortunately, it is expensive and inaccessible to many institutions. This study investigates the potential of an artificial intelligence (AI) model to predict BC molecular subtypes directly from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs).

**Methods:** A pretrained deep learning framework based on multiple-instance learning (MIL) was validated on the Stavanger Breast Cancer (SBC) dataset, consisting of 538 BC cases. Three classification tasks were assessed: two-class [triple-negative BC (TNBC) vs. non-TNBC], three-class (luminal vs. HER2-positive vs. TNBC), and four-class (luminal A vs. luminal B vs. HER2-positive vs. TNBC). Performance was evaluated with metrics including AUC, accuracy, and F1-score.

**Results:** The AI model demonstrated strong performance in distinguishing TNBC from non-TNBC (AUC = 0.823, accuracy = 0.833, F1-score = 0.824). However, performance declined with an increasing number of classes.

**Conclusions:** The study highlights the potential of AI in BC molecular subtyping from H&E WSIs, offering an easily applicable and standardized alternative to IHC. Future improvements should focus on optimizing multi-class classification and validating AI-based methods against gene expression analyses for enhanced clinical applicability.
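The relationship between the three classification tasks, and the metrics reported for the two-class task, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the label strings, the helper names, and the toy predictions are hypothetical and are not taken from the paper. The summary also does not state how the F1-score was averaged, so the sketch computes F1 for a single positive class (TNBC).

```python
# Hypothetical label names; the paper's actual label encoding is not given here.
FOUR_TO_THREE = {
    "luminal A": "luminal",
    "luminal B": "luminal",
    "HER2-positive": "HER2-positive",
    "TNBC": "TNBC",
}


def to_three_class(label):
    """Collapse a four-class label into the three-class grouping."""
    return FOUR_TO_THREE[label]


def to_two_class(label):
    """Collapse any subtype label into TNBC vs. non-TNBC."""
    return "TNBC" if label == "TNBC" else "non-TNBC"


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as half), computed over all positive/negative pairs."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data only, not results from the study.
four_class = ["luminal A", "luminal B", "HER2-positive", "TNBC", "TNBC", "luminal A"]
y_true = [to_two_class(label) for label in four_class]
y_pred = ["non-TNBC", "non-TNBC", "TNBC", "TNBC", "non-TNBC", "non-TNBC"]
tnbc_scores = [0.1, 0.2, 0.7, 0.9, 0.4, 0.3]  # hypothetical model scores for TNBC

print(accuracy(y_true, y_pred))
print(f1_score(y_true, y_pred, positive="TNBC"))
print(auc(y_true, tnbc_scores, positive="TNBC"))
```

The pairwise AUC is quadratic in the number of cases, which is fine for a cohort of a few hundred slides; a library routine would normally be used in practice.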

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Genes:** ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}
- **Diseases:** cancer (MESH:D009369), BC (MESH:D001943), triple negative BC (MESH:D064726)
- **Chemicals:** H&E (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12523546/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12523546/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12523546/full.md

---
Source: https://tomesphere.com/paper/PMC12523546