# Histo-Miner: Deep learning based tissue features extraction pipeline from H&E whole slide images of cutaneous squamous cell carcinoma

**Authors:** Lucas Sancéré, Carina Lorenz, Doris Helbig, Oana-Diana Persa, Sonja Dengler, Alexander Kreuter, Martim Laimer, Roland Lang, Anne Fröhlich, Jennifer Landsberg, Johannes Brägelmann, Katarzyna Bozek, Stacey D. Finley

PMC · DOI: 10.1371/journal.pcbi.1013907 · 2026-01-21

## TL;DR

Histo-Miner is a deep learning pipeline for analyzing skin cancer tissue images, extracting features that predict patient response to immunotherapy.

## Contribution

Histo-Miner introduces a novel deep learning pipeline and two new annotated datasets for cutaneous squamous cell carcinoma analysis.

## Key findings

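- Histo-Miner reaches a multi-class Panoptic Quality (mPQ) of 0.569 for nucleus segmentation, a macro-averaged F1 of 0.832 for nucleus classification, and a mean Intersection over Union (mIoU) of 0.907 for tumor region segmentation.
- The pipeline predicts immunotherapy response from pre-treatment WSIs of 45 cSCC patients with a mean area under the ROC curve of 0.755 ± 0.091 over cross-validation, and identifies immune cell features predictive of that response.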
- Histo-Miner's design allows for application to other cancer types and datasets.

## Abstract

Recent advances in digital pathology have enabled comprehensive analyses of Whole-Slide Images (WSIs) from tissue samples, leveraging high-resolution microscopy and computational capabilities. Despite this progress, available tools for automatic cell type identification perform poorly on skin tissue, e.g. in the classification of non-melanoma tumor cells. This is due to a paucity of labeled training datasets and high morphological similarities between tumor and non-tumor epithelial cells in the skin. Here, we propose Histo-Miner, a deep learning-based pipeline designed for the analysis of skin WSIs. To this end, we generated two new datasets using WSIs of cutaneous Squamous Cell Carcinoma (cSCC) samples, a frequent non-melanoma skin cancer, by annotating 47,392 cell nuclei across 5 cell types in 21 WSIs and segmenting tumor regions in 144 WSIs. Histo-Miner employs convolutional neural networks and vision transformers for nucleus segmentation and classification, as well as tumor region segmentation. Performance of the trained models compares favorably with the state of the art, with a multi-class Panoptic Quality (mPQ) of 0.569 for nucleus segmentation, a macro-averaged F1 of 0.832 for nucleus classification and a mean Intersection over Union (mIoU) of 0.907 for tumor region segmentation. From these outputs, the pipeline can generate a compact feature vector summarizing tissue morphology and cellular interactions, which can be used for various downstream tasks. As an exemplary use case, we deploy Histo-Miner to predict cSCC patient response to immunotherapy based on pre-treatment WSIs from 45 patients. Histo-Miner predicts patient response with a mean area under the ROC curve of 0.755 ± 0.091 over cross-validation, and identifies the percentages of lymphocytes, the granulocyte-to-lymphocyte ratio in the tumor vicinity and the distances between granulocytes and plasma cells in tumors as predictive features for therapy response.
This highlights the applicability of Histo-Miner to clinically relevant scenarios, providing direct interpretation of the classification and insights into the underlying biology. Importantly, Histo-Miner is designed to allow for its use on other cancer types and on other training datasets. Our tool and datasets are available through our GitHub repository: https://github.com/bozeklab/histo-miner.
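To make two of the reported metrics concrete, here is a minimal pure-Python sketch of mean Intersection over Union (mIoU) and macro-averaged F1 computed from flat label sequences. It is an illustration of the standard definitions, not the authors' evaluation code; the function names `mean_iou` and `macro_f1` and the toy labels are assumptions made for this example, and mPQ is not covered.

```python
def mean_iou(pred, truth, classes):
    """Mean IoU over classes for two equally long label sequences."""
    ious = []
    for c in classes:
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    return sum(ious) / len(ious)


def macro_f1(pred, truth, classes):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in classes:
        tp = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, truth) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, truth) if p != c and t == c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (hypothetical): 0 = background, 1 = tumor region.
truth = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
miou = mean_iou(pred, truth, classes=[0, 1])
f1 = macro_f1(pred, truth, classes=[0, 1])
```

Both metrics average per-class scores without weighting by class frequency, so rare classes count as much as common ones.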

Digital pathology is transforming how we study disease by turning tissue samples into high-resolution images that capture the architecture of entire tumors. However, these images are vast and complex, making it difficult to extract meaningful clinical insights without advanced computational tools. In this work, we present Histo-Miner, a framework designed to systematically analyze these images at multiple levels of detail, from single cells to entire tissue regions. We apply this approach to cutaneous squamous cell carcinoma, a common form of skin cancer, demonstrating how large-scale tissue data can be mined for biological insights. Our method identifies and characterizes different types of cells, maps how they are organized within tumor areas, and connects these patterns to patient outcomes. Through this lens, we uncover subtle features of the tissue environment that may influence how patients respond to therapy. We find that the most informative features describe the presence and balance of different types of immune system cells, and how these cells are spatially arranged within the tissue. Beyond its immediate findings, Histo-Miner provides openly available data and tools that aim to make large-scale tissue analysis more interpretable, reproducible, and transferable to other diseases.
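The kinds of features described above (cell type percentages, a ratio between two immune cell types, and distances between cell types) can be sketched from a list of classified nuclei. This is a simplified illustration under stated assumptions, not Histo-Miner's actual feature extraction: the `(label, x, y)` tuple layout, the label strings, and the function names are all hypothetical, and restriction to the tumor region or its vicinity is omitted.

```python
import math
from collections import Counter


def cell_type_percentages(nuclei):
    """Percentage of nuclei per cell type; nuclei are (label, x, y) tuples."""
    counts = Counter(label for label, _, _ in nuclei)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def type_ratio(nuclei, numerator, denominator):
    """Count ratio between two cell types, or None if the denominator is absent."""
    counts = Counter(label for label, _, _ in nuclei)
    if counts[denominator] == 0:
        return None
    return counts[numerator] / counts[denominator]


def mean_nearest_distance(nuclei, source, target):
    """Mean distance from each source-type nucleus to its closest target-type nucleus."""
    src = [(x, y) for label, x, y in nuclei if label == source]
    tgt = [(x, y) for label, x, y in nuclei if label == target]
    if not src or not tgt:
        return None
    return sum(min(math.dist(s, t) for t in tgt) for s in src) / len(src)


# Hypothetical nuclei with centroid coordinates in arbitrary units.
nuclei = [
    ("lymphocyte", 0, 0),
    ("lymphocyte", 1, 0),
    ("granulocyte", 0, 3),
    ("plasma", 0, 7),
    ("tumor", 5, 5),
]
percentages = cell_type_percentages(nuclei)
ratio = type_ratio(nuclei, "granulocyte", "lymphocyte")
distance = mean_nearest_distance(nuclei, "granulocyte", "plasma")
```

The brute-force nearest-neighbor search is quadratic in the number of nuclei; on a whole slide with many nuclei, a spatial index such as a k-d tree would be the more practical choice.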

## Linked entities

- **Diseases:** cutaneous squamous cell carcinoma (MONDO:0002529), skin cancer (MONDO:0002898)

## Full-text entities

- **Genes:** PDCD1 (programmed cell death 1) [NCBI Gene 5133] {aka ADMIO4, AIMTBS, CD279, PD-1, PD1, SLEB2}, MPO (myeloperoxidase) [NCBI Gene 4353], CD79A (CD79a molecule) [NCBI Gene 973] {aka IGA, IGAlpha, MB-1, MB1}, IL9 (interleukin 9) [NCBI Gene 3578] {aka HP40, IL-9, P40}, FUT4 (fucosyltransferase 4) [NCBI Gene 2526] {aka CD15, ELFT, FCT3A, FUC-TIV, FUTIV, LeX}, MME (membrane metalloendopeptidase) [NCBI Gene 4311] {aka CALLA, CD10, CMT2T, NEP, SCA43, SFE}
- **Diseases:** breast cancer (MESH:D001943), PD (MESH:D018450), metastases (MESH:D009362), cSCC skin cancer (MESH:D018307), non (MESH:C580335), Cancer (MESH:D009369), skin WSIs (MESH:C564543), melanoma (MESH:D008545), Cutaneous Squamous Cell Carcinoma (MESH:D002294), CPI (MESH:C565433), non-melanoma skin cancer (MESH:D012878)
- **Chemicals:** Hematoxylin (MESH:D006416), Eosin (MESH:D004801), H&E (MESH:D006371), SCC (MESH:C007020), Anita Estes (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090], Sarcocystis entzerothi (species) [taxon 1913021]
- **Cell lines:** SCC — Siniperca chuatsi (Mandarin fish), Spontaneously immortalized cell line (CVCL_C0WY), CellViT-SAM-B — Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_8440), H&E — Homo sapiens (Human), Transformed cell line (CVCL_ZD53)

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12854473/full.md

---
Source: https://tomesphere.com/paper/PMC12854473