# Multi-view deep learning of highly multiplexed imaging data improves association of cell states with clinical outcomes

**Authors:** Shanza Ayub, Jennifer L Gorman, Edward L Y Chen, Hartland W Jackson, Alina Selega, Kieran R Campbell

PMC · DOI: 10.1093/bioadv/vbag010 · Bioinformatics Advances · 2026-01-14

## TL;DR

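This paper introduces hmiVAE, a multi-modal variational autoencoder that integrates several views of each cell from highly multiplexed imaging (mean expression, morphology, sub-cellular protein co-localization, and spatial cellular context). The resulting latent space is often more associated with patient-specific clinical outcomes than a set of existing baselines.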

## Contribution

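The contribution is a multi-modal variational autoencoder that learns a unified latent representation from multiple views of each cell in highly multiplexed imaging data, while conditioning on technical and batch-specific effects, so that cell states can be better associated with clinical outcomes.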

## Key findings

- The integrated multi-modal latent space is often more associated with patient-specific clinical outcomes than a set of existing baselines.
- Ablation analyses are used to assess which input views contribute to model performance.
- The authors also explore whether the learned representations align with cellular phenotypes and enable integration across divergent datasets.

## Abstract

Analysis workflows for highly multiplexed imaging technologies typically summarize each cell in terms of its post-segmentation mean expression, but additional cellular information can be quantified including cell morphology, sub-cellular expression patterns, and spatial cellular context, ultimately giving a multi-modal view of each cell. While deep learning models such as variational autoencoders are well-established for other multi-modal single-cell assays, their ability to integrate these multiple views of a cell from highly multiplexed imaging data remains largely unknown.
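As an illustration of what these views can look like in practice, the following is a minimal sketch, not the hmiVAE featurization. It assumes a toy multi-channel image and an integer segmentation mask, and the names `cell_views` and `spatial_context` are hypothetical. Co-localization is approximated here by the Pearson correlation between channel pairs over a cell's pixels, and spatial context by the average expression of neighbouring cells within a radius; the paper may define both differently.

```python
import numpy as np

def cell_views(image, mask):
    """Per-cell views from image (C, H, W) and mask (H, W); 0 = background.

    Hypothetical helper: assumes every cell covers more than one pixel.
    """
    pair_idx = np.triu_indices(image.shape[0], k=1)  # unique channel pairs
    means, areas, centroids, coloc = [], [], [], []
    for cell_id in np.unique(mask):
        if cell_id == 0:
            continue
        inside = mask == cell_id
        pixels = image[:, inside]                    # (C, n_pixels)
        means.append(pixels.mean(axis=1))            # mean expression view
        areas.append(inside.sum())                   # simple morphology view
        rows, cols = np.nonzero(inside)
        centroids.append([rows.mean(), cols.mean()])
        # Constant channels give undefined correlations; map those to 0.
        with np.errstate(invalid="ignore", divide="ignore"):
            corr = np.atleast_2d(np.corrcoef(pixels))
        coloc.append(np.nan_to_num(corr[pair_idx]))  # co-localization view
    return {
        "mean_expression": np.array(means),
        "area": np.array(areas),
        "centroid": np.array(centroids),
        "colocalization": np.array(coloc),
    }

def spatial_context(centroids, mean_expression, radius):
    """Average expression of other cells whose centroid lies within radius."""
    dist = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    neighbours = (dist <= radius) & ~np.eye(len(centroids), dtype=bool)
    counts = neighbours.sum(axis=1, keepdims=True)
    summed = neighbours.astype(float) @ mean_expression
    # Cells with no neighbours keep an all-zero context vector.
    return np.divide(summed, counts, out=np.zeros_like(summed), where=counts > 0)

# Toy data: two channels, two 2x2 cells on a 4x4 image.
image = np.zeros((2, 4, 4))
image[1] = np.arange(16).reshape(4, 4)
mask = np.zeros((4, 4), dtype=int)
mask[:2, :2] = 1
mask[2:, 2:] = 2
image[0][mask == 1] = 1.0
image[0][mask == 2] = 3.0

views = cell_views(image, mask)
context = spatial_context(views["centroid"], views["mean_expression"], radius=3.0)
```

Keeping each view as its own array, rather than concatenating them up front, makes it straightforward to drop a view later, for example in an ablation.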

Here, we explore the abilities of multi-modal variational autoencoders to learn unified latent cellular representations from multiple views of each single cell quantified from highly multiplexed imaging, including mean expression, morphology, sub-cellular protein co-localization, and spatial cellular context, while conditioning on technical and batch-specific effects. We show that the integrated multi-modal latent space is often more associated with patient-specific clinical outcomes compared to a set of existing baselines. In addition, we perform ablation analyses to understand which input views contribute to model performance, and explore the ability of these models to learn cellular representations that align with cellular phenotypes and enable integration across divergent datasets.
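To make the idea of a unified latent space over several views concrete, here is a minimal NumPy sketch of one forward pass through a multi-view conditional VAE. It is an illustration under stated assumptions, not the hmiVAE architecture: the class name `MultiViewVAE`, the layer sizes, the random untrained weights, the one-hot batch covariates, and the squared-error reconstruction term are all hypothetical choices made for this example.

```python
import numpy as np

class MultiViewVAE:
    """Toy multi-view conditional VAE (forward pass only, untrained weights)."""

    def __init__(self, view_dims, n_covariates, hidden=16, latent=4, seed=0):
        self.rng = np.random.default_rng(seed)
        # One small encoder per view, then a shared layer to the latent space.
        self.enc = {name: self._dense(d, hidden) for name, d in view_dims.items()}
        joint_in = hidden * len(view_dims) + n_covariates
        self.mu = self._dense(joint_in, latent)
        self.logvar = self._dense(joint_in, latent)
        # One decoder per view, conditioned on the covariates as well.
        self.dec = {name: self._dense(latent + n_covariates, d)
                    for name, d in view_dims.items()}

    def _dense(self, n_in, n_out):
        return self.rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

    def forward(self, views, covariates):
        # Encode each view separately, then join with the batch covariates.
        hidden = [np.tanh(views[name] @ W + b) for name, (W, b) in self.enc.items()]
        joint = np.concatenate(hidden + [covariates], axis=1)
        z_mean = joint @ self.mu[0] + self.mu[1]
        z_logvar = joint @ self.logvar[0] + self.logvar[1]
        # Reparameterization trick: z = mean + std * noise.
        eps = self.rng.standard_normal(z_mean.shape)
        z = z_mean + np.exp(0.5 * z_logvar) * eps
        zc = np.concatenate([z, covariates], axis=1)
        # Squared-error reconstruction summed over all views (an assumption).
        recon = sum(0.5 * ((views[name] - (zc @ W + b)) ** 2).sum(axis=1)
                    for name, (W, b) in self.dec.items())
        # Closed-form KL divergence between N(mean, var) and N(0, I).
        kl = -0.5 * np.sum(1 + z_logvar - z_mean ** 2 - np.exp(z_logvar), axis=1)
        return {"z_mean": z_mean, "recon": recon, "kl": kl,
                "loss": float((recon + kl).mean())}

# Toy inputs: 5 cells, 4 views, 2 batches encoded one-hot.
data_rng = np.random.default_rng(1)
view_dims = {"expression": 6, "morphology": 3,
             "colocalization": 15, "spatial_context": 6}
views = {name: data_rng.normal(size=(5, d)) for name, d in view_dims.items()}
covariates = np.eye(2)[[0, 0, 1, 1, 0]]

model = MultiViewVAE(view_dims, n_covariates=2)
out = model.forward(views, covariates)
```

In this sketch the covariates enter both the encoder and the decoders, which is one common way to condition on batch so that the latent means (`out["z_mean"]`) need not encode it. An ablation in this setup would amount to removing a key from `view_dims` and `views`.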

hmiVAE is implemented as a Python package and is available at https://github.com/camlab-bioml/hmiVAE.

## Full-text entities

- **Genes:** PECAM1 (platelet and endothelial cell adhesion molecule 1) [NCBI Gene 5175] {aka CD31, CD31/EndoCAM, GPIIA', PECA1, PECAM-1, endoCAM}, VIM (vimentin) [NCBI Gene 7431], KRT19 (keratin 19) [NCBI Gene 3880] {aka CK19, K19, K1CS}, KRT14 (keratin 14) [NCBI Gene 3861] {aka CK14, EBS1, EBS1A, EBS1B, EBS1C, EBS1D}, KRT7 (keratin 7) [NCBI Gene 3855] {aka CK7, K2C7, K7, SCL}, KRT20 (keratin 20) [NCBI Gene 54474] {aka CD20, CK-20, CK20, K20, KRT21}, PTPRC (protein tyrosine phosphatase receptor type C) [NCBI Gene 5788] {aka B220, CD45, CD45R, GP180, IMD105, L-CA}, GATA3 (GATA binding protein 3) [NCBI Gene 2625] {aka HDR, HDRS}, KRT5 (keratin 5) [NCBI Gene 3852] {aka CK5, DDD, DDD1, EBS1, EBS2, EBS2A}, EREG (epiregulin) [NCBI Gene 2069] {aka EPR, ER, Ep}, SMN1 (survival of motor neuron 1, telomeric) [NCBI Gene 6606] {aka BCD541, GEMIN1, SMA, SMA1, SMA2, SMA3}, GRHL3 (grainyhead like transcription factor 3) [NCBI Gene 57822] {aka SOM, TFCP2L4, VWS2}, BRAF (B-Raf proto-oncogene, serine/threonine kinase) [NCBI Gene 673] {aka B-RAF1, B-raf, BRAF-1, BRAF1, NS7, RAFB1}, IDO1 (indoleamine 2,3-dioxygenase 1) [NCBI Gene 3620] {aka IDO, IDO-1, INDO}, CA9 (carbonic anhydrase 9) [NCBI Gene 768] {aka CAIX, MN}, CD68 (CD68 molecule) [NCBI Gene 968] {aka GP110, LAMP4, SCARD1}, ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}, FOXP3 (forkhead box P3) [NCBI Gene 50943] {aka AIID, DIETER, IPEX, JM2, PIDX, XPID}, VWF (von Willebrand factor) [NCBI Gene 7450] {aka F8VWF, VWD}, NRAS (NRAS proto-oncogene, GTPase) [NCBI Gene 4893] {aka ALPS4, CMNS, N-ras, NCMS, NRAS1, NS6}, CD14 (CD14 molecule) [NCBI Gene 929], FN1 (fibronectin 1) [NCBI Gene 2335] {aka CIG, ED-B, FINC, FN, FNZ, GFND}, TOX (thymocyte selection associated high mobility group box) [NCBI Gene 9760] {aka TOX1}, SOX10 [NCBI Gene 101094609], CDH1 (cadherin 1) [NCBI Gene 999] {aka Arc-1, BCDS1, CD324, CDHE, ECAD, LCAM}, EPCAM (epithelial cell adhesion molecule) [NCBI Gene 4072] {aka Ber-Ep4, BerEp4, DIAR5, EGP-2, EGP314, EGP40}, FUT4 (fucosyltransferase 4) [NCBI Gene 2526] {aka CD15, ELFT, FCT3A, FUC-TIV, FUTIV, LeX}, ITGAM (integrin subunit alpha M) [NCBI Gene 3684] {aka CD11B, CR3A, HNA-4, MAC-1, MAC1A, MO1A}, MiTF [NCBI Gene 101091431], EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}
- **Diseases:** cancer (MESH:D009369), PDAC (MESH:D010190), Hoch-Melanoma (MESH:D008545), hypoxia (MESH:D000860), hypoxic (MESH:D002534), invasive lobular carcinoma (MESH:D018275), invasive carcinoma (MESH:D009361), IMC (MESH:C564543), stage 4 or stage 3 melanoma (MESH:D062706), ARI (MESH:D000275), breast and melanoma cancers (MESH:D001943)
- **Chemicals:** EDTA (MESH:D004492), Tamoxifen (MESH:D013629), xylene (MESH:D014992), paraffin (MESH:D010232), metal (MESH:D008670), TBS (MESH:D013725), Iridium (MESH:D007495), water (MESH:D014867), TBS-T (-), Formalin (MESH:D005557), alcohol (MESH:D000438), PBS (MESH:D007854)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** Ali-BC — Homo sapiens (Human), EBV-related Burkitt lymphoma, Cancer cell line (CVCL_7172)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12955845/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12955845/full.md

## References

84 references — full list in the complete paper: https://tomesphere.com/paper/PMC12955845/full.md

---
Source: https://tomesphere.com/paper/PMC12955845