# Self-supervised learning to predict intrahepatic cholangiocarcinoma transcriptomic classes on routine histology

**Authors:** Aurélie Beaufrère, Tristan Lazard, Rémy Nicolle, Gwladys Lubuela, Jérémy Augustin, Miguel Albuquerque, Baptiste Pichon, Camille Pignolet, Victoria Priori, Nathalie Théou-Anton, Mickael Lesurtel, Mohamed Bouattour, Kévin Mondet, Jérôme Cros, Julien Calderaro, Thomas Walter, Valérie Paradis

PMC · DOI: 10.1016/j.jhepr.2025.101675 · JHEP Reports · 2025-11-11

## TL;DR

This study develops a self-supervised learning model to predict transcriptomic classes of intrahepatic cholangiocarcinoma from routine histology slides, offering a practical alternative to costly molecular analyses.

## Contribution

A self-supervised learning model is introduced to predict iCCA transcriptomic classes from histology slides without manual annotations or large tissue samples.

## Key findings

- The model achieved AUCs of 0.63–0.84 for predicting the four most frequent transcriptomic classes in the discovery set.
- Validation on TCGA and French external sets showed AUCs of 0.76–0.80 and 0.62–0.92, respectively.
- The model performed best for the hepatic stem-like class, with an AUC of 0.84 in the discovery set.

## Abstract

The transcriptomic classification of intrahepatic cholangiocarcinoma (iCCA) has recently been refined from two to five classes, each associated with pathological features, targetable genetic alterations, and survival outcomes. Despite its potential prognostic and therapeutic value, the transcriptomic classification is not routinely used in practice because of technical limitations, including insufficient tissue material and the high cost of molecular analyses. Here, we assessed a self-supervised learning (SSL) model for predicting iCCA transcriptomic classes on digitised whole-slide images (WSIs).

Transcriptomic classes defined from RNA sequencing data were available for all samples. The SSL method (Giga-SSL) was used to train our model on a discovery set of 766 WSIs from 137 biopsies and 109 surgical specimens obtained from 246 patients, using a five-fold cross-validation scheme. The model was validated in The Cancer Genome Atlas (TCGA) cohort (n = 29) and a French external validation set (n = 32), both using WSIs from surgical samples.
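The cross-validation step can be illustrated with a minimal sketch. It assumes the folds are built at the patient level, so that all WSIs from one patient land in the same fold; the abstract does not state how the folds were drawn, so this grouping is an assumption. The function name `patient_level_folds` and the toy slide and patient identifiers are hypothetical.

```python
import random
from collections import defaultdict


def patient_level_folds(slide_to_patient, n_folds=5, seed=0):
    """Split slides into n_folds lists, keeping each patient's slides together.

    Assumption: grouping by patient is an illustrative choice, not a detail
    confirmed by the abstract.
    """
    patients = sorted(set(slide_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)
    # Round-robin assignment of shuffled patients to folds.
    patient_fold = {p: i % n_folds for i, p in enumerate(patients)}
    folds = defaultdict(list)
    for slide, patient in slide_to_patient.items():
        folds[patient_fold[patient]].append(slide)
    return [sorted(folds[k]) for k in range(n_folds)]


# Hypothetical toy cohort: 12 patients with 1 to 3 slides each.
slide_to_patient = {
    f"slide_{p}_{s}": f"patient_{p}"
    for p in range(12)
    for s in range(1 + p % 3)
}

folds = patient_level_folds(slide_to_patient)
for k, held_out in enumerate(folds):
    # Train on the other four folds, evaluate on the held-out one.
    train = [s for j, fold in enumerate(folds) if j != k for s in fold]
    print(f"fold {k}: {len(train)} training slides, {len(held_out)} held out")
```

Grouping by patient avoids evaluating on slides from a patient whose other slides were seen during training, which would otherwise inflate the cross-validated scores.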

The most frequent transcriptomic class was the hepatic stem-like class (37% [90/246] in the discovery set). Our model showed good to very good performance in predicting the four most frequent transcriptomic classes in the discovery set (AUC 0.63–0.84), especially for the hepatic stem-like class (AUC 0.84). The model performed equally well in predicting these transcriptomic classes in the two validation sets, with AUCs ranging from 0.76 to 0.80 in the TCGA set and 0.62 to 0.92 in the French external set.
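Per-class AUCs like those reported above can be computed in a one-vs-rest fashion, which is sketched here. The abstract does not spell out the evaluation procedure, so the one-vs-rest setup is an assumption, and the function names, the label `other_class`, and the toy scores are hypothetical. The AUC is computed directly as the probability that a sample of the class receives a higher score than a sample outside it, with ties counted as one half.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A tie between a positive and a negative counts as half a correct pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(true_classes, class_scores):
    """Compute one AUC per class, treating that class as positive.

    class_scores maps each class name to a list of predicted scores,
    one per sample, in the same order as true_classes.
    """
    return {
        name: binary_auc([t == name for t in true_classes], scores)
        for name, scores in class_scores.items()
    }


# Hypothetical toy predictions for four samples and two classes.
true_classes = ["hepatic_stem_like", "other_class",
                "hepatic_stem_like", "other_class"]
class_scores = {
    "hepatic_stem_like": [0.9, 0.2, 0.6, 0.7],
    "other_class": [0.1, 0.8, 0.4, 0.3],
}

aucs = one_vs_rest_auc(true_classes, class_scores)
for name, value in aucs.items():
    print(f"{name}: AUC = {value:.2f}")
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients; a rank-based implementation would be preferable for larger sets.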

We developed and validated an SSL-based model capable of predicting iCCA transcriptomic classes from routine histological slides of both biopsy and surgical samples. This approach may facilitate the clinical implementation of transcriptomic classification, improve prognostic assessment, and guide therapeutic decision-making in iCCA.

Predicting transcriptomic classes directly from routine histological slides has the potential to enhance the clinical management of intrahepatic cholangiocarcinoma, enabling more accurate prognostication and supporting therapeutic decision-making. By eliminating the need for manual slide annotation, large tissue samples, or resource-intensive molecular analyses, our self-supervised learning-based model offers a practical and scalable solution that can be applied to both biopsy and surgical specimens. This approach could accelerate the adoption of transcriptomic classification in everyday practice and help guide more personalized treatment strategies for patients with intrahepatic cholangiocarcinoma.


- Five transcriptomic classes of iCCA have been described, each associated with prognosis.
- We developed and validated an SSL model for predicting these transcriptomic classes.
- We obtained good performances for the prediction of the four most frequent classes.
- Our model could be used on routine slides of biopsy or surgical specimens.
- Our model could be used without the need for manual slide annotations.


## Linked entities

- **Diseases:** intrahepatic cholangiocarcinoma (MONDO:0003210), iCCA (MONDO:0011178)

## Full-text entities

- **Diseases:** iCCA (MESH:D018281), Cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12800354/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12800354/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12800354/full.md

---
Source: https://tomesphere.com/paper/PMC12800354