# SpaConTDS: A multimodal contrastive learning framework for identifying spatial domains by applying tuple disturbing strategy

**Authors:** Ruiwen Xu, Xiaoqing Cheng, Waiki Ching, Siyao Wu, Yuanben Zhang, Yidan Zhang

PMC · DOI: 10.1371/journal.pcbi.1013893 · 2026-01-29

## TL;DR

SpaConTDS is a new framework that uses multimodal contrastive learning and reinforcement learning to improve the identification of spatial domains in spatial transcriptomics data.

## Contribution

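SpaConTDS introduces a pseudo-label tuple perturbation (tuple disturbing) strategy for constructing contrastive samples, and uses reinforcement learning to tune hyperparameters dynamically, within a multimodal contrastive learning framework for spatial domain identification.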

## Key findings

- SpaConTDS achieves state-of-the-art accuracy in spatial domain identification.
- The model outperforms existing methods in downstream tasks such as denoising, trajectory inference, and UMAP visualization.
- SpaConTDS effectively integrates multiple tissue sections and corrects batch effects without alignment.

## Abstract

The rational utilization of multimodal spatial transcriptomics (ST) data enables accurate identification of spatial domains, which is essential for investigating cellular structure and functions. In this study, we propose SpaConTDS, a novel framework that integrates reinforcement learning with self-supervised multimodal contrastive learning. SpaConTDS generates positive and negative samples through data augmentation and a pseudo-label tuple perturbation strategy, enabling the learning of fused representations that capture global semantics and cross-modal interactions. The model’s hyperparameters are dynamically optimized using reinforcement learning. Extensive experiments across various resolutions and platforms demonstrate that SpaConTDS achieves state-of-the-art accuracy in spatial domain identification and outperforms existing methods in downstream tasks such as denoising, trajectory inference, and UMAP visualization. Moreover, SpaConTDS effectively integrates multiple tissue sections and corrects batch effects without requiring prior alignment. Compared to existing approaches, SpaConTDS offers more robust fused representations of multimodal data, providing researchers with a flexible and powerful tool for a wide range of spatial transcriptomics analyses.

Advancements in ST technologies have enabled researchers to simultaneously capture histological features, gene expression profiles, and spatial information. Unsupervised clustering of captured spots into spatial domains is a fundamental component of spatial transcriptomics analysis; its objective is to delineate spatially coherent regions that are typically associated with distinct biological functions or tissue architectures. In this paper, we propose a multimodal spatial domain identification model called SpaConTDS. Its innovations can be summarized as follows. First, it uses a self-supervised multimodal contrastive learning method to integrate gene expression and histopathological image information for spatial domain identification and alignment-free slice integration. Second, SpaConTDS uses reinforcement learning and global positive/negative sample construction strategies to adaptively capture fused representations that encompass interactions between modalities, which ensures that weak modalities are not neglected while avoiding the introduction of noise from image information. Third, for multi-slice integration, the positive and negative samples derived from the global similarity matrix can cover all slices, enabling SpaConTDS to automatically smooth the features of adjacent points both within and across slices without slice alignment, and thus to learn more comprehensive cross-slice information and alleviate batch effects. The numerical results demonstrate that SpaConTDS outperforms existing methods in both spatial domain identification and integrated analysis of multiple slices. Moreover, the learned representations are applicable to various downstream tasks, including trajectory inference, gene expression denoising, and uniform manifold approximation and projection (UMAP) visualization.
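
The global positive/negative sample construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`global_pos_neg_masks`, `contrastive_loss`), the top-k selection rule, the InfoNCE-style loss, and the temperature value are all assumptions chosen for illustration, and the pseudo-label tuple perturbation and reinforcement-learning components are not modeled.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale every row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def global_pos_neg_masks(features, k_pos=3, k_neg=3):
    """Hypothetical helper: choose positives and negatives from one global
    similarity matrix over all spots (spots from every slice stacked row-wise).

    Positives are the k_pos most similar spots, negatives the k_neg least
    similar ones; a spot is never paired with itself.
    """
    z = l2_normalize(features)
    sim = z @ z.T
    n = sim.shape[0]
    rows = np.arange(n)[:, None]

    np.fill_diagonal(sim, -np.inf)              # self can never be a positive
    pos_idx = np.argsort(-sim, axis=1)[:, :k_pos]
    np.fill_diagonal(sim, np.inf)               # self can never be a negative
    neg_idx = np.argsort(sim, axis=1)[:, :k_neg]

    pos_mask = np.zeros((n, n), dtype=bool)
    neg_mask = np.zeros((n, n), dtype=bool)
    pos_mask[rows, pos_idx] = True
    neg_mask[rows, neg_idx] = True
    neg_mask &= ~pos_mask                       # keep the two sets disjoint
    return pos_mask, neg_mask


def contrastive_loss(embeddings, pos_mask, neg_mask, temperature=0.5):
    """InfoNCE-style loss (an assumed form): pull positives together and
    push negatives apart in the embedding space."""
    z = l2_normalize(embeddings)
    scores = np.exp((z @ z.T) / temperature)
    pos = (scores * pos_mask).sum(axis=1)
    neg = (scores * neg_mask).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))


# Toy data standing in for fused spot features from several slices.
rng = np.random.default_rng(0)
spot_features = rng.normal(size=(20, 8))
pos_mask, neg_mask = global_pos_neg_masks(spot_features, k_pos=3, k_neg=4)
loss_aligned = contrastive_loss(spot_features, pos_mask, neg_mask)
loss_random = contrastive_loss(rng.normal(size=(20, 8)), pos_mask, neg_mask)
```

Because the similarity matrix in this sketch is computed over all spots at once, a spot's positives and negatives may come from any slice, which is the property the text relies on for alignment-free integration.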

## Full-text entities

- **Genes:** MUC1 (mucin 1, cell surface associated) [NCBI Gene 4582] {aka ADMCKD, ADMCKD1, ADTKD2, CA 15-3, CD227, Ca15-3}, DUSP23 (dual specificity phosphatase 23) [NCBI Gene 54935] {aka DUSP25, LDP-3, LDP3, MOSP, VHZ}, ak1 (adenylate kinase 1) [NCBI Gene 445486] {aka zgc:91930}, EFHD1 (EF-hand domain family member D1) [NCBI Gene 80303] {aka MST133, MSTP133, PP3051, SWS2}, SHISA2 (shisa family member 2) [NCBI Gene 387914] {aka C13orf13, PRO28631, TMEM46, WGAR9166, bA398O19.2, hShisa}, TCEAL4 (transcription elongation factor A like 4) [NCBI Gene 79921] {aka NPD017, WEX7}, b2m (beta-2-microglobulin) [NCBI Gene 30400] {aka bwm, wu:fa94c05, wu:fb10a09}, EIF3H (eukaryotic translation initiation factor 3 subunit H) [NCBI Gene 8667] {aka EIF3S3, eIF3-gamma, eIF3-p40}, CTTN (cortactin) [NCBI Gene 2017] {aka EMS1}, CXCL14 (C-X-C motif chemokine ligand 14) [NCBI Gene 9547] {aka BMAC, BRAK, KEC, KS1, MIP-2g, MIP2G}, UBE2S (ubiquitin conjugating enzyme E2 S) [NCBI Gene 27338] {aka E2-EPF, E2EPF, EPF5}, SERHL2 (serine hydrolase like 2) [NCBI Gene 253190] {aka dJ222E13.1}, COX7C (cytochrome c oxidase subunit 7C) [NCBI Gene 1350], ckma (creatine kinase, muscle a) [NCBI Gene 30095] {aka CK-M, cb51, ckm, mck, wu:fa28d05}, SERPINA3 (serpin family A member 3) [NCBI Gene 12] {aka AACT, ACT, GIG24, GIG25}, CCND1 (cyclin D1) [NCBI Gene 595] {aka BCL1, D11S287E, PRAD1, U21B31}, CRISP3 (cysteine rich secretory protein 3) [NCBI Gene 10321] {aka Aeg2, CRISP-3, CRS3, SGP28, dJ442L6.3}, ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}, PTN (pleiotrophin) [NCBI Gene 5764] {aka HARP, HB-GAM, HBBM, HBGF-8, HBGF8, HBNF}, NUPR1 (nuclear protein 1, transcriptional regulator) [NCBI Gene 26471] {aka COM1, P8}, hmgb2a (high mobility group box 2a) [NCBI Gene 641484] {aka hmgb2, wu:fb22b10, wu:fc95d12, zgc:123215}, atp2a1l (ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 1, like) [NCBI Gene 494489] {aka im:7145237, si:ch211-198p7.3, zgc:154094}
- **Diseases:** CL (MESH:D007859), inflammation (MESH:D007249), Breast cancer (MESH:D001943), Ductal Carcinoma (MESH:D044584), ARI (MESH:D000275), muscle (MESH:D019042), ST (MESH:D008569), cancer (MESH:D009369), invasive carcinoma (MESH:D009361), DCIS (MESH:D002285), Melanoma (MESH:D008545), papillary carcinoma (MESH:D002291)
- **Chemicals:** hematoxylin (MESH:D006416), eosin (MESH:D004801), H&E (MESH:D006371), PAGA (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090], Danio rerio (leopard danio, species) [taxon 7955]

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12854462/full.md

---
Source: https://tomesphere.com/paper/PMC12854462