# Deciphering hierarchical regulatory network of cell fate via an epigenetics-informed heterogeneous graph transformer on single-cell multi-omics data

**Authors:** Yuhong Huang, Chao Liu, Zhiling Yang, Bo Liu, Xiao Zhai, Jiajin Zheng, Jing Xiao, Tao Song

PMC · DOI: 10.1093/bib/bbaf664 · Briefings in Bioinformatics · 2025-12-12

## TL;DR

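This paper introduces SMOGT, a heterogeneous graph transformer method that models hierarchical regulatory networks (transcription factors, cis-regulatory elements, and target genes) from single-cell multi-omics data. It predicts TF-CRE regulation and CRE-CRE chromatin contacts more accurately than existing methods and links the inferred network to cell fate transitions.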

## Contribution

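SMOGT integrates epigenetic mechanisms into a heterogeneous graph transformer by routing information along a hierarchy-guided meta-path (TF-TF → TF-CRE → CRE-CRE → CRE-TG), and uses a semi-supervised training strategy to keep the inferred hierarchical regulatory network accurate.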
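To make the meta-path idea concrete, here is a minimal sketch of a typed (heterogeneous) graph whose edges are restricted to the four relations named in the abstract (TF-TF, TF-CRE, CRE-CRE, CRE-TG). It is an illustration only, not the authors' implementation: the function names, the toy node names (`TF_A`, `cre_1`, `gene_x`, ...), and the choice to treat the TF-TF and CRE-CRE steps as optional hops are all assumptions.

```python
from collections import defaultdict

# Relation order taken from the abstract's meta-path; all other names are hypothetical.
META_PATH = [("TF", "TF"), ("TF", "CRE"), ("CRE", "CRE"), ("CRE", "TG")]


def build_hetero_graph(edges):
    """Group (src_type, src, dst_type, dst) edges by relation.

    Relations outside the meta-path (for example TG -> TF) are rejected,
    so information can only flow down the hierarchy.
    """
    allowed = set(META_PATH)
    graph = defaultdict(list)
    for src_type, src, dst_type, dst in edges:
        relation = (src_type, dst_type)
        if relation not in allowed:
            raise ValueError(f"relation {relation} is not on the meta-path")
        graph[relation].append((src, dst))
    return dict(graph)


def reachable_targets(graph, tf):
    """Follow the meta-path from one TF and return the target genes it can reach."""
    frontier = {tf}
    for relation in META_PATH:
        step = {dst for src, dst in graph.get(relation, []) if src in frontier}
        if relation[0] == relation[1]:
            # Assumption: same-type steps (TF-TF, CRE-CRE) are optional,
            # so nodes already in the frontier are carried forward.
            step |= frontier
        frontier = step
    return frontier


# Toy network with made-up node names.
toy_edges = [
    ("TF", "TF_A", "TF", "TF_B"),
    ("TF", "TF_B", "CRE", "cre_1"),
    ("CRE", "cre_1", "CRE", "cre_2"),
    ("CRE", "cre_1", "TG", "gene_y"),
    ("CRE", "cre_2", "TG", "gene_x"),
]
graph = build_hetero_graph(toy_edges)
targets = reachable_targets(graph, "TF_A")
```

In this toy network `TF_A` reaches both genes only through `TF_B` and the CRE layer, which is the kind of layered dependency the meta-path is meant to enforce.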

## Key findings

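- On ChIP-seq and Hi-C benchmark datasets, SMOGT predicts transcriptional regulation (TF-CRE) and long-range chromatin conformation (CRE-CRE) more accurately than existing methods.
- Its multi-layer random walk (MRWR) module identifies driver regulators and their target genes during cell fate transitions, and its BioStreamNet module predicts how in silico perturbations shift cell fate trajectories.
- In melanoma epithelial-to-mesenchymal transition it points to a therapeutic window for reversing the process, and in AML it uncovers hub CREs with prognostic value.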

## Abstract

The precise control of cell fate is driven by a hierarchical regulatory network (HRNet) in which transcription factors (TFs) and cis-regulatory elements (CREs) orchestrate the expression of target genes (TGs) through complex causal actions. While single-cell multi-omics technologies provide multi-dimensional data to resolve regulatory networks, existing methods often fail to capture their hierarchical and causal properties. We propose SMOGT (Single-cell Multi-Omics Graph Transformer), a graph representation learning method to decipher the HRNet. SMOGT embeds epigenetic mechanisms into a Heterogeneous Graph Transformer (HGT) by structuring information flow along a hierarchy-guided meta-path (TF-TF → TF-CRE → CRE-CRE → CRE-TG), and employs a semi-supervised strategy to ensure network accuracy. Validated against ChIP-seq and Hi-C benchmark datasets, SMOGT showed significantly higher accuracy in predicting transcriptional regulation (TF-CRE) and long-range chromatin conformation (CRE-CRE). The HRNet scaffolds downstream modules that mechanistically link network architecture to cell fate. The multi-layer random walk (MRWR) module identifies driver regulators and their TGs. The BioStreamNet module predicts shifts in cell fate trajectories following in silico perturbations within gene-specific HRNets, which are formed by extracting regulatory weights during TG expression prediction. In hematopoietic stem cell differentiation, SMOGT elucidated the hierarchical causal cascade from driver TFs that governs lineage commitment. In melanoma epithelial-to-mesenchymal transition (EMT), it revealed a critical therapeutic window for reversing the process, and in acute myeloid leukemia (AML), it uncovered hub CREs with significant prognostic value. By accurately modeling hierarchical causality, SMOGT provides a robust tool to dissect and predict cell fate dynamics in both development and disease.
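The abstract does not specify how the MRWR module is computed. As a rough illustration of how a random walk can rank regulators and targets on a layered network, the sketch below runs a generic random walk with restart over a weighted, directed TF → CRE → TG graph. The restart term is an assumption suggested by the acronym, and the function name, parameters, edge weights, and node names are all hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on a weighted directed graph.

    adj maps each node to {neighbour: weight}; seeds are the restart nodes.
    Returns a dict of steady-state visiting probabilities that sums to 1.
    """
    nodes = sorted(set(adj) | {v for nbrs in adj.values() for v in nbrs})
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            nbrs = adj.get(u, {})
            total = sum(nbrs.values())
            if total == 0:
                # Dangling node (e.g. a target gene): send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in nbrs.items():
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy layered network with made-up weights: one TF, one CRE, two target genes.
toy_adj = {
    "TF_A": {"cre_1": 1.0},
    "cre_1": {"gene_x": 0.8, "gene_y": 0.2},
}
scores = random_walk_with_restart(toy_adj, seeds={"TF_A"})
ranked_genes = sorted(["gene_x", "gene_y"], key=scores.get, reverse=True)
```

Seeding the walk at a candidate driver TF and ranking target genes by their steady-state score is one common way to read such a walk; the paper's actual scoring may differ.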

## Linked entities

- **Diseases:** melanoma (MONDO:0005105), Acute Myeloid Leukemia (MONDO:0015667)

## Full-text entities

- **Genes:** DNMT3A (DNA methyltransferase 3 alpha) [NCBI Gene 1788] {aka DNMT3A2, HESJAS, M.HsaIIIA, TBRS}, FOXO1 (forkhead box O1) [NCBI Gene 2308] {aka FKH1, FKHR, FOXO1A}, NFATC2 (nuclear factor of activated T cells 2) [NCBI Gene 4773] {aka JCOSL, NFAT1, NFATP}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, TTR (transthyretin) [NCBI Gene 7276] {aka AMYLD1, ATTR, CTS, CTS1, HEL111, HsT2651}, GATA1 (GATA binding protein 1) [NCBI Gene 2623] {aka CNSHA9, ERYF1, GATA-1, GF-1, GF1, HAEADA}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, TCF4 (transcription factor 4) [NCBI Gene 6925] {aka CDG2T, E2-2, FCD2, FECD3, ITF-2, ITF2}, POU3F1 (POU class 3 homeobox 1) [NCBI Gene 5453] {aka OCT6, OTF6, SCIP}, IRF8 (interferon regulatory factor 8) [NCBI Gene 3394] {aka H-ICSBP, ICSBP, ICSBP1, IMD32A, IMD32B, IRF-8}, FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}, KLF1 (KLF transcription factor 1) [NCBI Gene 10661] {aka CDAN4A, CDAN4B, EKLF, EKLF/KLF1}, KIAA1958 (KIAA1958) [NCBI Gene 158405], DDX11 (DEAD/H-box helicase 11) [NCBI Gene 1663] {aka CHL1, CHLR1, KRG2, WABS}, NUP210 (nucleoporin 210) [NCBI Gene 23225] {aka GP210, POM210}, PDE3B (phosphodiesterase 3B) [NCBI Gene 5140] {aka HcGIP1, cGIPDE1}, RNF150 (ring finger protein 150) [NCBI Gene 57484], ERG (ETS transcription factor ERG) [NCBI Gene 2078] {aka LMPHM14, erg-3, p55}, GATA2 (GATA binding protein 2) [NCBI Gene 2624] {aka DCML, IMD21, MONOMAC, NFE1B}, CEBPB (CCAAT enhancer binding protein beta) [NCBI Gene 1051] {aka C/EBP-beta, IL6DBP, NF-IL6, TCF5}, SPNS3 (SPNS lysolipid transporter 3, sphingosine-1-phosphate (putative)) [NCBI Gene 201305] {aka SLC62A3, SLC63A3}, CALML3 (calmodulin like 3) [NCBI Gene 810] {aka CLP}, FUT1 (fucosyltransferase 1 (H blood group)) [NCBI Gene 2523] {aka H, HH, HSC}, CEBPA (CCAAT enhancer binding protein alpha) [NCBI Gene 1050] {aka C/EBP-alpha, CEBP}, POU2F2 (POU class 2 homeobox 2) [NCBI Gene 5452] {aka OCT2, OTF2, Oct-2}, NPM1 (nucleophosmin 1) [NCBI Gene 4869] {aka B23, NPM}, MYC (MYC proto-oncogene, bHLH transcription factor) [NCBI Gene 4609] {aka MRTL, MYCC, bHLHe39, c-Myc}, SQLE (squalene epoxidase) [NCBI Gene 6713], HS3ST3B1 (heparan sulfate-glucosamine 3-sulfotransferase 3B1) [NCBI Gene 9953] {aka 3-OST-3B, 3OST3B1, h3-OST-3B}, PLA2G4A (phospholipase A2 group IVA) [NCBI Gene 5321] {aka GURDP, PLA2G4, cPLA2, cPLA2-alpha}, SOX9 (SRY-box transcription factor 9) [NCBI Gene 6662] {aka CMD1, CMPD1, ENH13, SRA1, SRXX2, SRXY10}, SLC24A3 (solute carrier family 24 member 3) [NCBI Gene 57419] {aka NCKX3}, KLF4 (KLF transcription factor 4) [NCBI Gene 9314] {aka EZF, GKLF}, RUNX2 (RUNX family transcription factor 2) [NCBI Gene 860] {aka AML3, CBF-alpha-1, CBFA1, CCD, CCD1, CLCD}, MMRN1 (multimerin 1) [NCBI Gene 22915] {aka ECM, EMILIN4, GPIa*, MMRN}, NFIA (nuclear factor I A) [NCBI Gene 4774] {aka BRMUTD, C1DELp32p31, CTF, DEL1P32P31, NF-I/A, NF1-A}
- **Diseases:** BM (MESH:D001855), AML (MESH:D015470), Myeloid_CLP (MESH:D007951), HCG (MESH:C538388), TG (MESH:C537680), cancer (MESH:D009369), SE (MESH:C535318), melanoma (MESH:D008545), leukemia (MESH:D007938), HGT (MESH:D002472)
- **Chemicals:** GCN (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** GM12878 — Homo sapiens (Human), Transformed cell line (CVCL_7526), HCT116 — Homo sapiens (Human), Colon carcinoma, Cancer cell line (CVCL_0291), A549 — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_0023), K562 — Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12875533/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12875533/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/PMC12875533/full.md

---
Source: https://tomesphere.com/paper/PMC12875533