CTAS: a network control theory-based approach to identify key regulatory TFs of AS events during epithelial–mesenchymal transition
Yan Gan, Yangsong He, Pu Zhao, Wai-Ki Ching, Yushan Qiu

TL;DR
This paper introduces CTAS, a new method to identify key transcription factors that control alternative splicing during epithelial–mesenchymal transition.
Contribution
CTAS uses network control theory to uncover multi-layered regulatory logic from bulk data, enabling the identification of TFs controlling EMT-related AS events.
Findings
On a synthetic benchmark, CTAS reconstructs the pseudotime ordering with high accuracy (Spearman’s ρ = 0.99946) and infers the directed network with an ROC AUC of 89.9%.
In TCGA BRCA data, CTAS identifies HOXA3, PRDM8, and TWIST2 as top TF controllers of AS events during EMT.
Dynamic shifts in nine AS events were detected, with ZNF521 and HIC1 highlighted as candidate regulators in a CD44 subnetwork.
Abstract
Alternative splicing (AS) is a key driver of transcriptomic diversity and plays a pivotal role in epithelial–mesenchymal transition (EMT). During EMT, dynamic splicing changes contribute to cell plasticity and metastasis, yet the upstream regulatory logic remains unclear. Although transcription factors (TFs) are thought to influence AS programs, they typically act through RNA-binding proteins (RBPs), forming a hierarchical TF $\rightarrow$ RBP…
| | | |
|---|---|---|
| HOXA3 | 1.00 | [ |
| PRDM8 | 0.86 | [ |
| TWIST2 | 0.83 | [ |

| | | |
|---|---|---|
| PRDM8 | 0.77 | [ |

| | | |
|---|---|---|
| ZNF521 | 0.86 | [ |
| MAF | 0.46 | [ |

| | | |
|---|---|---|
| HIC1 | 0.65 | [ |
| PRDM8 | 0.31 | [ |

- National Natural Science Foundation of China (10.13039/501100001809)
- Guangdong Basic and Applied Basic Research Foundation (10.13039/501100021171)
- Shenzhen Science and Technology Program
Taxonomy
Topics: RNA Research and Splicing · Single-cell and spatial transcriptomics · Cancer-related molecular mechanisms research
Introduction
Tumor metastasis refers to the spread of cancer cells from the primary site to distant organs, where they form new lesions [1]. Among the multiple molecular and cellular mechanisms involved, epithelial–mesenchymal transition (EMT) is a key process that enables cancer cells to acquire migratory and invasive capabilities. EMT describes the transition of epithelial cells to a mesenchymal phenotype with reduced epithelial characteristics, and it plays critical roles in embryonic development, tissue repair, fibrosis, and tumor metastasis [2, 3]. While the transcriptional regulatory network of EMT has been extensively studied [4–6], the role of alternative splicing (AS) in this process remains incompletely understood.
AS generates structurally and functionally diverse mRNA and protein isoforms by selecting different splice sites in precursor mRNA [7–9]. AS is regulated by cis-acting elements and trans-acting splicing factors, primarily RNA-binding proteins (RBPs), which bind to specific RNA sequences and influence splice site recognition [10, 11]. Transcription factors (TFs) can modulate AS indirectly by regulating the transcription or activity of RBPs, thereby acting in concert with RBPs to control splicing outcomes in target genes [12, 13]. Systematically uncovering the dynamic regulatory relationships among AS, RBPs, and TFs during EMT is critical to understanding the molecular mechanisms underlying cancer progression.
Gene regulatory networks (GRNs) describe complex gene–gene interactions and have been widely used to analyze biological regulation. Existing GRN inference approaches, including Boolean networks [14–16], differential equation models [17, 18], Bayesian networks [19, 20], association networks [21], and machine-learning-based methods such as GENIE3 [22], have advanced our understanding of transcriptional control. However, these approaches mainly characterize pairwise or single-layer regulatory relationships and thus cannot fully capture multi-layer dynamics across TFs, RBPs, and AS events. Moreover, most GRN studies identify hub regulators using network topology, which reflects structure but not dynamic influence. In contrast, control-theoretic approaches view regulation as a dynamic system, revealing how perturbations such as gene knockouts can steer a network toward specific states. This perspective complements topological analysis and provides mechanistic insight into network controllability [23, 24].
For directed networks, control theory seeks the minimum set of driver nodes whose external inputs can steer the system toward a desired state. Linear or locally nonlinear control tools based on the maximum-matching set identify the smallest set of input nodes required to achieve controllability [25]. When the precise system equations are unknown, feedback-vertex-set (FVS) control schemes can ensure reliable nonlinear control [26, 27], and the extended FVS model jointly considers source and FVS nodes to reduce control cost [28]. For undirected networks, structural controllability can be assessed using the minimum dominating set [29], and recent nonlinear control frameworks such as NCUA [30] have further expanded its applicability. Together, these methods form the theoretical foundation for analyzing control in biomolecular networks.
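As a rough illustration of the maximum-matching idea mentioned above, the following minimal Python sketch finds the nodes of a small directed network that are left unmatched by a maximum matching and would therefore need an external input under the linear structural-controllability view. It uses a plain augmenting-path search rather than any specific tool from the cited work, and the function name `driver_nodes` and the toy edge lists are hypothetical.

```python
def driver_nodes(n, edges):
    """Return nodes left unmatched by a maximum matching of a directed graph.

    Nodes are 0..n-1 and edges are (source, target) pairs. Each node is split
    into an "out" copy and an "in" copy; an edge u -> v links out(u) to in(v).
    Nodes whose "in" copy stays unmatched are taken as driver nodes.
    """
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)

    # matched_source[v] = u means edge u -> v belongs to the current matching.
    matched_source = [None] * n

    def try_augment(u, visited):
        # Depth-first search for an augmenting path starting at out(u).
        for v in successors[u]:
            if v in visited:
                continue
            visited.add(v)
            if matched_source[v] is None or try_augment(matched_source[v], visited):
                matched_source[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())

    unmatched = [v for v in range(n) if matched_source[v] is None]
    # A perfectly matched network (e.g. a cycle) still needs one input;
    # node 0 is an arbitrary choice in that case.
    return unmatched if unmatched else [0]


# Toy cascade 0 -> 1 -> 2: only the head of the chain needs an input.
chain_drivers = driver_nodes(3, [(0, 1), (1, 2)])
```

The number of driver nodes is determined by the size of the maximum matching, but the set itself is generally not unique: different maximum matchings can leave different nodes unmatched.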
Despite these advances, the interaction mechanisms among AS events, their regulatory RBPs, and upstream TFs during cancer-associated EMT remain largely unresolved. While several models have explored AS–RBP relationships, most infer dynamic regulation from time-series data, which are difficult to obtain in clinical contexts. In contrast, cross-sectional omics datasets are more accessible but lack explicit temporal information, limiting causal inference and dynamic interpretation. Existing methods for such data mainly rely on co-expression or association analysis, which cannot uncover regulatory causality or reveal key drivers of phenotypic transitions.
Pseudotemporal ordering provides an alternative by reordering cross-sectional samples according to expression similarity to reconstruct latent trajectories of biological progression. This strategy has been effectively used to capture nonlinear cellular dynamics and infer regulatory changes over time. For example, the latent-temporal progression-based Bayesian method inferred GRNs using gene expression and pathological information [31], while the pseudotime causality-based Bayesian model identified dynamic relationships between AS events and RBPs during breast cancer EMT [32]. However, these approaches still lack the ability to integrate TFs into the regulatory hierarchy or quantify regulatory influence.
Building upon these foundations, we develop CTAS, a network control theory-based framework to identify key TFs regulating AS during EMT. CTAS integrates pseudotime ordering, trend analysis, sparse directed network inference, and control-theoretic screening into a unified framework. Using cross-sectional transcriptomic data, CTAS reconstructs dynamic trajectories, infers TF $\rightarrow$ RBP $\rightarrow$ AS regulatory cascades, and identifies TFs most capable of steering AS dynamics. Applied to simulated and real datasets [33], CTAS reveals hierarchical control mechanisms underlying EMT and identifies potential master TF regulators validated by biological evidence.
The major contributions of this study are as follows: (i) we present the first comprehensive investigation of the TF–RBP–AS regulatory hierarchy during EMT, constructing a dynamic network that reveals a TF $\rightarrow$ RBP $\rightarrow$ AS cascade; (ii) we employ pseudotime algorithms to reconstruct temporal trajectories from cross-sectional data, transforming static omics into dynamic insights; and (iii) we introduce a structure-based constrained target control (CTC) framework to identify key TFs regulating EMT-associated AS events, providing new clues for potential therapeutic targets in cancer.
Materials and methods
Problem definition and method outline
EMT is a hallmark of tumor invasion and metastasis. Dysregulated AS plays an essential regulatory role in EMT, and its aberrations can disrupt normal cellular function and promote tumor progression. AS events therefore hold potential as biomarkers and therapeutic targets in cancer. RBPs regulate AS by binding to cis-regulatory elements within introns and exons, whereas TFs modulate AS indirectly by regulating the expression or activity of RBPs.
Figure 1 illustrates the overall EMT process, highlighting the hierarchical regulation among AS events, RBPs, and TFs. Despite their significance, the dynamic regulatory relationships among these molecular components during EMT remain poorly understood. Moreover, obtaining time-series data for EMT is costly, and most cancer transcriptomic datasets lack explicit temporal information.
Figure 1. Schematic representation of EMT integrating AS events, RBPs, and TFs.
To address these limitations, we developed CTAS, a control-theory-based framework for constructing and analyzing regulatory networks from cross-sectional data. CTAS reconstructs dynamic expression trajectories via pseudotime analysis, identifies EMT-associated AS events, RBPs, and TFs through trend analysis, builds a dynamic TF–RBP–AS network using ordinary differential equations (ODEs) and mass-action kinetics, and applies CTC to pinpoint key TFs that regulate EMT-related AS events.
Datasets
We utilized TCGA BRCA Level 3 RNA-SeqV2 gene expression data from the Genomic Data Commons (GDC) Legacy Archive. Details of preprocessing are provided in Text S1. A total of 143 epithelial and 157 mesenchymal samples were selected. Rows with $\geq 100$ missing values were removed, and the remaining data were imputed using knnimpute. Rows with zero variance were excluded. The final processed datasets included a $10{,}049\times 300$ AS matrix (Table S1), a $1525\times 300$ RBP matrix (Table S2), and a $1500\times 300$ TF matrix (Table S3).
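The two row filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix is a list of feature rows with missing entries encoded as NaN, the helper names `drop_sparse_rows` and `drop_constant_rows` are hypothetical, and the imputation step between the two filters is not shown.

```python
import math


def drop_sparse_rows(matrix, max_missing=100):
    """Keep rows (features) with fewer than `max_missing` NaN entries."""
    return [
        row for row in matrix
        if sum(1 for x in row if math.isnan(x)) < max_missing
    ]


def drop_constant_rows(matrix):
    """Keep rows whose values are not all identical (non-zero variance).

    Assumes missing values were already imputed, so rows contain no NaN.
    """
    return [row for row in matrix if max(row) != min(row)]
```

In this sketch the intended order is `drop_sparse_rows`, then imputation, then `drop_constant_rows`, mirroring the order of the steps in the text.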
Overview of the CTAS framework
Figure 2 presents the CTAS workflow, which comprises three main modules: (i) pseudotime and trend analysis, (ii) construction of the TF–RBP–AS regulatory network, and (iii) network control analysis to identify key TFs.
Figure 2. Overview of the CTAS framework.
In the first module, pseudotime analysis reorders cross-sectional samples to reconstruct latent temporal trajectories, and trend analysis then detects AS events, RBPs, and TFs exhibiting significant monotonic changes along pseudotime. In the second module, CTAS constructs an ODE-based network model under sparsity assumptions and estimates parameters via Bayesian Lasso regression. In the third module, CTC identifies the TFs most capable of steering AS dynamics during EMT.
Pseudotime analysis
Pseudotime trajectories were inferred from cross-sectional data using similarity graph-based random walks (Text S2). Epithelial and mesenchymal samples were labeled as 1 and 2, respectively. Each sample was assigned a pseudotime score, enabling reconstruction of a smooth temporal trajectory. The pseudotime order derived from TF expression was subsequently applied to RBPs and AS events, ensuring consistent temporal alignment across molecular layers.
Trend analysis
Trend analysis was performed to identify EMT-related AS events, RBPs, and TFs (Text S3). For each molecular feature, we computed the ratio between its linear trend and detrended standard deviation along pseudotime, using the absolute value as its trend score. A higher score indicates greater temporal variation and stronger EMT association. AS events, RBPs, and TFs with the highest trend scores were selected for subsequent modeling.
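A minimal Python sketch of such a trend score is given below. The exact definition is in Text S3 and is not reproduced here, so this version rests on two assumptions: the "linear trend" is taken to be the least-squares slope of the feature against pseudotime, and the "detrended standard deviation" is the standard deviation of the residuals around that fitted line. The function name `trend_score` is hypothetical.

```python
import math


def trend_score(pseudotime, values):
    """Absolute least-squares slope divided by the residual standard deviation."""
    n = len(values)
    mean_t = sum(pseudotime) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in pseudotime)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(pseudotime, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t

    # Detrend: remove the fitted line and measure the remaining spread.
    residuals = [v - (intercept + slope * t) for t, v in zip(pseudotime, values)]
    detrended_sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    if detrended_sd == 0.0:
        return math.inf  # perfectly linear feature
    return abs(slope) / detrended_sd
```

Because the score uses the absolute value, a feature that falls steadily along pseudotime ranks as highly as one that rises steadily, while a feature that only fluctuates around a constant level scores low.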
Construction of the regulatory network
The interactions among TFs, RBPs, and AS events were modeled as a dynamic system governed by mass-action kinetics. The system of ODEs is defined as:
\begin{align*}
\frac{\mathrm{d}X_{i}(s)}{\mathrm{d}s} &= \sum_{j\neq i}a_{ij}X_{i}(s)X_{j}(s)+\sum_{l=1}^{M}b_{il}X_{i}(s)Y_{l}(s)-d_{i}X_{i}(s), \tag{1}\\
\frac{\mathrm{d}Y_{l}(s)}{\mathrm{d}s} &= \sum_{k\neq l}c_{lk}Y_{l}(s)Y_{k}(s)+\sum_{p=1}^{H}e_{lp}Y_{l}(s)Z_{p}(s)-d_{l}^{\prime}Y_{l}(s), \tag{2}\\
\frac{\mathrm{d}Z_{p}(s)}{\mathrm{d}s} &= \sum_{p\neq q}g_{pq}Z_{p}(s)Z_{q}(s)-d_{p}^{\prime\prime}Z_{p}(s). \tag{3}
\end{align*}
In these equations, $X_{i}(s)$, $Y_{l}(s)$, and $Z_{p}(s)$ denote expression levels of AS events, RBPs, and TFs, respectively, at pseudotime $s$. $a_{ij}$, $b_{il}$, $c_{lk}$, $e_{lp}$, and $g_{pq}$ are dynamic regulatory coefficients, whereas $d_{i}$, $d_{l}^{\prime}$, and $d_{p}^{\prime\prime}$ represent self-degradation rates. Because AS depends on RBPs and RBPs depend on TFs, only couplings between adjacent layers (AS–RBP and RBP–TF) appear in Equations (1)–(3); there is no direct TF–AS term. The system assumes sparse connectivity consistent with biological networks, and therefore the parameters are estimated using Bayesian Lasso regression, which explicitly enforces sparsity in high-dimensional parameter estimation (Text S4).
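To make the structure of the ODE system above concrete, the following Python sketch evaluates its right-hand side for given coefficient matrices and advances it with forward Euler. It is an illustration only: the function names, the argument order, and the choice of forward Euler are assumptions, and the sketch does not reproduce the Bayesian Lasso estimation of the coefficients.

```python
def mass_action_rhs(X, Y, Z, a, b, c, e, g, d_as, d_rbp, d_tf):
    """Right-hand side of the three-layer system (AS: X, RBP: Y, TF: Z)."""
    # AS events: AS-AS interactions, RBP regulation, self-degradation.
    dX = [
        X[i] * (
            sum(a[i][j] * X[j] for j in range(len(X)) if j != i)
            + sum(b[i][m] * Y[m] for m in range(len(Y)))
            - d_as[i]
        )
        for i in range(len(X))
    ]
    # RBPs: RBP-RBP interactions, TF regulation, self-degradation.
    dY = [
        Y[m] * (
            sum(c[m][k] * Y[k] for k in range(len(Y)) if k != m)
            + sum(e[m][p] * Z[p] for p in range(len(Z)))
            - d_rbp[m]
        )
        for m in range(len(Y))
    ]
    # TFs: TF-TF interactions and self-degradation only (top layer).
    dZ = [
        Z[p] * (
            sum(g[p][q] * Z[q] for q in range(len(Z)) if q != p)
            - d_tf[p]
        )
        for p in range(len(Z))
    ]
    return dX, dY, dZ


def euler_integrate(X, Y, Z, params, step, n_steps):
    """Advance the system along pseudotime with forward Euler steps."""
    for _ in range(n_steps):
        dX, dY, dZ = mass_action_rhs(X, Y, Z, *params)
        X = [x + step * dx for x, dx in zip(X, dX)]
        Y = [y + step * dy for y, dy in zip(Y, dY)]
        Z = [z + step * dz for z, dz in zip(Z, dZ)]
    return X, Y, Z
```

Here `params` is the tuple `(a, b, c, e, g, d_as, d_rbp, d_tf)`. Setting an entry of `b` or `e` to zero removes the corresponding RBP-to-AS or TF-to-RBP edge, which is how sparsity of the inferred network shows up in the dynamics.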
Control of the regulatory network
Structural control theory was employed to identify key TFs that regulate EMT-associated AS events. The CTC framework determines the smallest set of driver nodes (TFs) required to control a given set of target nodes (AS events) (Text S5). Within CTC [34], the system is represented as:
\begin{align*}
\begin{cases}
\dfrac{\mathrm{d}x}{\mathrm{d}t}=Ax+Bu,\\[3pt]
y=Cx,
\end{cases} \tag{4}
\end{align*}
where $x\in\mathbb{R}^{N}$ and $y\in\mathbb{R}^{N_{o}}$ denote state variables (nodes) and target outputs. $A\in\mathbb{R}^{N\times N}$, $B\in\mathbb{R}^{N\times\phi}$, and $C\in\mathbb{R}^{N_{o}\times N}$ represent the state-transition, input, and output matrices.
Let $V=\{v_{1},\dots,v_{N}\}$ denote all nodes, $O$ the target set, and $U$ the constrained control set. CTC identifies the smallest subset $K\subseteq U$ satisfying:
\begin{align*}
\operatorname{rank}\bigl([CB, CAB, CA^{2}B,\dots,CA^{N-1}B]\bigr)=N_{o}. \tag{5}
\end{align*}
When Equation (5) holds, the system $(A,B,C)$ is said to be constrained-target controllable. Structural controllability is achieved when nonzero entries of $A$ can be freely assigned, ensuring
\begin{align*}
\max\Bigl\{\operatorname{rank}\bigl([CB, CAB, CA^{2}B,\dots,CA^{N-1}B]\bigr)\Bigr\}=N_{o}. \tag{6}
\end{align*}
CTC extends both Kalman and classical target controllability. In this study, EMT-related AS events were defined as target nodes, and TFs were treated as constrained control nodes. A greedy iterative algorithm constructs bipartite graphs to delineate controllable subsystems. The Hopcroft–Karp algorithm then computes maximum matchings and determines control paths from TFs to AS events. The minimal set of driver TFs is identified using a minimum coverage approach and optimized via branch-and-bound linear programming. Finally, Markov-chain sampling produces alternative maximum-matching configurations to evaluate robustness. Figure 2 conceptually summarizes this process, in which different matching configurations yield distinct control paths, allowing robust identification of key TFs across network realizations.
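The rank condition on $[CB, CAB, \dots, CA^{N-1}B]$ can be checked directly for a small numeric example. The Python sketch below does so with exact rational arithmetic on a toy cascade TF $\rightarrow$ RBP $\rightarrow$ AS. It only tests the condition for one fixed choice of matrices; it does not implement the matching-based CTC search described above. The helper names and the convention that `A[i][j]` is nonzero when node `j` regulates node `i` are assumptions made for this illustration.

```python
from fractions import Fraction


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]


def rank(M):
    """Matrix rank by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_target_controllable(A, B, C):
    """Check rank([CB, CAB, ..., CA^(N-1)B]) == number of target outputs."""
    blocks = []
    power_B = B  # holds A^k B for k = 0, 1, ..., N-1
    for _ in range(len(A)):
        blocks.append(matmul(C, power_B))
        power_B = matmul(A, power_B)
    # Concatenate the blocks side by side, row by row.
    stacked = [sum((blk[i] for blk in blocks), []) for i in range(len(C))]
    return rank(stacked) == len(C)


# Toy cascade: node 0 (TF) -> node 1 (RBP) -> node 2 (AS).
A = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
B_tf = [[1], [0], [0]]   # external input applied to the TF
C_as = [[0, 0, 1]]       # the AS event is the only target
controllable = is_target_controllable(A, B_tf, C_as)
```

In this toy case the input reaches the AS event through the term $CA^{2}B$, that is, after two regulatory steps. Applying the input to the AS node while targeting the TF instead gives an all-zero matrix, so the condition fails.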
Results and discussion
Testing method with a synthetic dataset
To evaluate performance under cross-sectional sampling (Text S6), we simulated 2 TFs, 3 RBPs, and 5 AS events across 100 specimens. Figure 3a–c displays the original temporal profiles, and Fig. 3d–f presents randomly permuted cross-sectional inputs used by CTAS. Figure 3j shows that the inferred pseudotime closely matches the ground truth (Spearman’s $\rho = 0.99946$). Figure 3g–i demonstrates that recovered TF/RBP/AS dynamics resemble the original trends. Figure 3k reports an AUC of $89.915\%$ for network reconstruction via Bayesian Lasso, indicating accurate recovery of regulatory structure.
Figure 3. Synthetic dataset demonstration of model capabilities. (a–c) Original TF/RBP/AS expression; (d–f) cross-sectional inputs after random permutation; (g–i) recovered dynamics along inferred pseudotime; (j) correlation between inferred and true pseudotime (Spearman’s $\rho$); (k) ROC–AUC for network inference.
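For readers who want to reproduce this kind of pseudotime comparison on their own data, Spearman's rank correlation can be computed as the Pearson correlation of the ranks. The following standard-library Python sketch does this; the function names are hypothetical, and it assumes that neither input is constant.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

Since only ranks enter the calculation, any monotone rescaling of the inferred pseudotime leaves the coefficient unchanged, which is why it suits a comparison of orderings rather than of absolute time values.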
To assess robustness, we perturbed TF/RBP/AS values with multiplicative exponential noise (mean $\mu$, 0%–10%) and varied the coefficient of variation (CV) from 0% to 20%. The EMT trajectory was retained and sample IDs were shuffled to emulate cross-sectional acquisition. Figure 3a and b summarizes pseudotime accuracy measured by root mean square error (RMSE) and Spearman’s rank correlation coefficient ($\rho$), and Fig. 3c–f summarizes network metrics including area under the receiver operating characteristic curve (AUC), accuracy, positive predictive value (PPV), and Matthews correlation coefficient (MCC), all indicating stable performance.
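The threshold-based network metrics named above follow standard definitions from the confusion matrix of predicted versus true edges. The Python sketch below computes accuracy, PPV, and MCC from two parallel 0/1 lists. How continuous coefficient estimates are converted to binary edge calls is not specified here, so the binary inputs and the function name `edge_metrics` are assumptions.

```python
import math


def edge_metrics(true_edges, predicted_edges):
    """Accuracy, PPV, and MCC for binary edge calls (parallel lists of 0/1)."""
    pairs = list(zip(true_edges, predicted_edges))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # PPV is undefined without positive calls; report 0.0 in that case.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "ppv": ppv, "mcc": mcc}
```

MCC is useful alongside accuracy because sparse networks contain far more absent edges than present ones, so a method that predicts no edges at all can still reach high accuracy while its MCC stays at zero.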
We further formalized robustness as follows.
Theorem 1.1. Assume two pseudotime trajectories $s(r)$ and $\tilde{s}(r)$ share root $r\in I=[0,1]$ and define $\|\tilde{s}-s\|_{L^{2}}=\left(\int_{I}|\tilde{s}-s|^{2}\,\mathrm{d}r\right)^{1/2}$. If $(X_{i}(s),Y_{l}(s),Z_{p}(s),a_{ij},b_{il},c_{lk},e_{lp},g_{pq})$ and $(X_{i}(\tilde{s}),Y_{l}(\tilde{s}),Z_{p}(\tilde{s}),\tilde{a}_{ij},\tilde{b}_{il},\tilde{c}_{lk},\tilde{e}_{lp},\tilde{g}_{pq})$ both satisfy the progression-dependent system
\begin{align*}
\frac{\mathrm{d}X_{i}(s)}{\mathrm{d}s} &= \sum_{j\neq i}a_{ij}X_{i}(s)X_{j}(s)+\sum_{l=1}^{M}b_{il}X_{i}(s)Y_{l}(s)-d_{i}X_{i}(s),\\
\frac{\mathrm{d}Y_{l}(s)}{\mathrm{d}s} &= \sum_{k\neq l}c_{lk}Y_{l}(s)Y_{k}(s)+\sum_{p=1}^{H}e_{lp}Y_{l}(s)Z_{p}(s)-d_{l}^{\prime}Y_{l}(s),\\
\frac{\mathrm{d}Z_{p}(s)}{\mathrm{d}s} &= \sum_{p\neq q}g_{pq}Z_{p}(s)Z_{q}(s)-d_{p}^{\prime\prime}Z_{p}(s),
\end{align*}
with $i=1,\dots,N$, $l=1,\dots,M$, $p=1,\dots,H$, and
\begin{align*}
\frac{\mathrm{d}X_{i}(\tilde{s})}{\mathrm{d}\tilde{s}} &= \sum_{j\neq i}\tilde{a}_{ij}X_{i}(\tilde{s})X_{j}(\tilde{s})+\sum_{l=1}^{M}\tilde{b}_{il}X_{i}(\tilde{s})Y_{l}(\tilde{s})-\tilde{d}_{i}X_{i}(\tilde{s}),\\
\frac{\mathrm{d}Y_{l}(\tilde{s})}{\mathrm{d}\tilde{s}} &= \sum_{k\neq l}\tilde{c}_{lk}Y_{l}(\tilde{s})Y_{k}(\tilde{s})+\sum_{p=1}^{H}\tilde{e}_{lp}Y_{l}(\tilde{s})Z_{p}(\tilde{s})-\tilde{d}^{\prime}_{l}Y_{l}(\tilde{s}),
\end{align*}
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{align*} & \frac{\mathrm{d}Z_{p}(\tilde{s})}{\mathrm{d}\tilde{s}}=\sum_{p\neq q}\tilde{g}_{pq}Z_{p}(\tilde{s})Z_{q}(\tilde{s})-\tilde{d}^{\prime\prime}_{p}Z_{p}(\tilde{s}),\qquad\qquad\qquad\quad \end{align*}\end{document}
with \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} i=1,\dots ,N\end{document} , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} l=1,\dots ,M\end{document} , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} p=1,\dots ,H\end{document} , then we have
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{align*} & \lim_{\|\tilde{s}-s\|_{L^{2}}\rightarrow0}\left(\sum_{j=1}^{N}(\tilde{a}_{ij}-a_{ij})^{2}+\sum_{l=1}^{M}(\tilde{b}_{il}-b_{il})^{2}\right)=0, \end{align*}\end{document}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{align*} & \lim_{\|\tilde{s}-s\|_{L^{2}}\rightarrow0}\left(\sum_{k=1}^{M}(\tilde{c}_{lk}-c_{lk})^{2}+\sum_{p=1}^{H}(\tilde{e}_{lp}-e_{lp})^{2}\right)=0, \end{align*}\end{document}
and
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \begin{align*}& \lim_{\|\tilde{s}-s\|_{L^{2}}\rightarrow0}\left(\sum_{q=1}^{H}(\tilde{g}_{pq}-g_{pq})^{2}\right)=0.\end{align*}\end{document}The proof is provided in Text S7. Figures 3 and 4 together show that CTAS accurately recovers pseudotemporal order and regulatory structure and remains stable under substantial variability.
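For intuition about the progression-dependent system in Theorem 1.1, the following minimal sketch integrates a toy instance over $s\in[0,1]$ with a fourth-order Runge–Kutta scheme. The layer interpretation (X as AS events, Y as RBPs, Z as TFs) is inferred from the coupling structure of the equations. The sizes, coefficient values, initial states, step count, and the names `rhs`, `simulate`, and `toy` are illustrative assumptions, not values from the paper.

```python
def rhs(state, sizes, par):
    """Right-hand side of the three-layer system for a flat state [X, Y, Z]."""
    N, M, H = sizes
    X, Y, Z = state[:N], state[N:N + M], state[N + M:]
    # X layer: self-layer interactions a_ij, input from Y via b_il, decay d_i.
    dX = [X[i] * (sum(par["a"][i][j] * X[j] for j in range(N) if j != i)
                  + sum(par["b"][i][l] * Y[l] for l in range(M))
                  - par["d"][i]) for i in range(N)]
    # Y layer: self-layer interactions c_lk, input from Z via e_lp, decay d'_l.
    dY = [Y[l] * (sum(par["c"][l][k] * Y[k] for k in range(M) if k != l)
                  + sum(par["e"][l][p] * Z[p] for p in range(H))
                  - par["d1"][l]) for l in range(M)]
    # Z layer: only self-layer interactions g_pq and decay d''_p.
    dZ = [Z[p] * (sum(par["g"][p][q] * Z[q] for q in range(H) if q != p)
                  - par["d2"][p]) for p in range(H)]
    return dX + dY + dZ


def simulate(x0, y0, z0, par, steps=200):
    """Integrate from s = 0 to s = 1 with classical RK4; returns all states."""
    sizes = (len(x0), len(y0), len(z0))
    h = 1.0 / steps
    state = list(x0) + list(y0) + list(z0)
    trajectory = [state]
    for _ in range(steps):
        k1 = rhs(state, sizes, par)
        k2 = rhs([v + 0.5 * h * k for v, k in zip(state, k1)], sizes, par)
        k3 = rhs([v + 0.5 * h * k for v, k in zip(state, k2)], sizes, par)
        k4 = rhs([v + h * k for v, k in zip(state, k3)], sizes, par)
        state = [v + h / 6.0 * (q1 + 2 * q2 + 2 * q3 + q4)
                 for v, q1, q2, q3, q4 in zip(state, k1, k2, k3, k4)]
        trajectory.append(state)
    return trajectory


# Hypothetical toy instance with N = 2, M = 1, H = 2.
toy = {
    "a": [[0.0, -0.3], [0.2, 0.0]],
    "b": [[0.5], [-0.4]],
    "c": [[0.0]],
    "e": [[0.6, -0.2]],
    "g": [[0.0, -0.5], [0.3, 0.0]],
    "d": [0.1, 0.1], "d1": [0.2], "d2": [0.1, 0.1],
}
trajectory = simulate([1.0, 1.0], [1.0], [1.0, 0.5], toy)
print(trajectory[-1])  # state [X_1, X_2, Y_1, Z_1, Z_2] at s = 1
```

Because every right-hand side is proportional to its own state variable, positive initial values stay positive along the trajectory, which is what makes the logarithmic form of the equations usable for parameter estimation.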
Robustness assessment using synthetic noise, with CV varied over a range of values. (a and b) RMSE and Spearman’s $\rho$ for pseudotime; (c–f) AUC, accuracy, PPV, and MCC for network reconstruction.
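The evaluation metrics named in this caption can be computed as in the sketch below. It assumes that MCC here denotes the Matthews correlation coefficient, that network edges are scored as binary present/absent calls, and that AUC is computed from continuous edge scores. The function names and the toy inputs are hypothetical.

```python
import math


def rmse(est, true):
    """Root-mean-square error between estimated and true pseudotime."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(est, true)) / len(true))


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def edge_metrics(pred, true):
    """Accuracy, PPV and Matthews correlation for binary edge calls."""
    tp = sum(1 for p, t in zip(pred, true) if p and t)
    tn = sum(1 for p, t in zip(pred, true) if not p and not t)
    fp = sum(1 for p, t in zip(pred, true) if p and not t)
    fn = sum(1 for p, t in zip(pred, true) if not p and t)
    acc = (tp + tn) / len(true)
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, ppv, mcc


def roc_auc(scores, true):
    """AUC as the chance that a true edge outscores a non-edge (ties count 0.5)."""
    pos = [s for s, t in zip(scores, true) if t]
    neg = [s for s, t in zip(scores, true) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy inputs.
print(rmse([0.1, 0.2, 0.6, 0.7, 0.9], [0.0, 0.25, 0.5, 0.75, 1.0]))
print(spearman([0.1, 0.2, 0.6, 0.7, 0.9], [0.0, 0.25, 0.5, 0.75, 1.0]))
print(edge_metrics([1, 0, 0, 0], [1, 1, 0, 0]))
print(roc_auc([0.9, 0.4, 0.5, 0.1], [1, 1, 0, 0]))
```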
Construction and analysis of regulatory networks
We applied CTAS to the breast cancer dataset in [33]. Pseudotime ordering placed epithelial samples at the trajectory start and mesenchymal samples at the end (Table S4), whereas pathological stage showed little alignment with the ordering. Trend analysis then scored TFs, RBPs, and AS events, and we selected the top 50 AS events, 10 RBPs, and 10 TFs for modeling (Table S5). Bayesian Lasso estimated the parameters in Equations (1)–(3). Figure 5 presents the TF regulatory network inferred from Equation (3), revealing both antagonistic and synergistic relations; e.g. HOXA1 and TWIST2 suppress PRDM8, whereas HOXA7 promotes it.
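To see why a sparse linear estimator applies to this nonlinear model, the first equation of the progression-dependent system in Theorem 1.1 can be rewritten as below. This is a sketch under the assumption that $X_{i}(s)>0$ along the trajectory. The forward-difference approximation, the response $y_{it}$, and the penalty weight $\lambda$ are illustrative choices that this section does not specify. A Bayesian Lasso replaces the explicit $L^{1}$ penalty with a Laplace prior on the coefficients.

```latex
\begin{align*}
% Divide the X_i equation by X_i(s) > 0 to make it linear in the parameters.
& \frac{\mathrm{d}\ln X_{i}(s)}{\mathrm{d}s}
  =\frac{1}{X_{i}(s)}\frac{\mathrm{d}X_{i}(s)}{\mathrm{d}s}
  =\sum_{j\neq i}a_{ij}X_{j}(s)+\sum_{l=1}^{M}b_{il}Y_{l}(s)-d_{i}, \\
% Samples ordered by pseudotime s_1 < ... < s_T give a finite-difference response.
& y_{it}:=\frac{\ln X_{i}(s_{t+1})-\ln X_{i}(s_{t})}{s_{t+1}-s_{t}}
  \approx\sum_{j\neq i}a_{ij}X_{j}(s_{t})+\sum_{l=1}^{M}b_{il}Y_{l}(s_{t})-d_{i},
  \qquad t=1,\dots,T-1, \\
% An L1-penalized least-squares fit for row i of the network.
& \min_{a_{i\cdot},\,b_{i\cdot},\,d_{i}}\;
  \sum_{t=1}^{T-1}\Bigl(y_{it}-\sum_{j\neq i}a_{ij}X_{j}(s_{t})-\sum_{l=1}^{M}b_{il}Y_{l}(s_{t})+d_{i}\Bigr)^{2}
  +\lambda\Bigl(\sum_{j\neq i}|a_{ij}|+\sum_{l=1}^{M}|b_{il}|\Bigr).
\end{align*}
```

The same manipulation applies to the $Y_{l}$ and $Z_{p}$ equations, so each row of the network can be estimated as a separate regression. Nonzero coefficients give directed edges, and their signs indicate promotion or suppression.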
The TF regulatory network.
Figure 6 summarizes the integrated TF–RBP–AS network derived from Equations (1)–(3). The network uncovers hierarchical TF $\to$ RBP $\to$ AS cascades and both one-to-many and many-to-one control patterns across layers, explaining how coordinated TF–RBP interactions achieve precise splicing regulation.
The TF–RBP–AS regulatory network.
Control and analysis of regulatory networks
Table 1 lists key TFs identified by CTC when the constrained set comprises the top 5 TFs (by trend score) and the targets comprise the top 20 AS events; Markov-chain sampling highlights HOXA3, PRDM8, and TWIST2 (frequency $>0.5$). Table S6 reports cytoHubba/MCC results, with HOXA7, HOXA1, PRDM8, and TWIST2 ranked highest.
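The frequency criterion can be illustrated with a small sketch. How CTC samples candidate controller sets is not described in this section, so the code covers only the final step: counting how often each TF appears across sampled sets and keeping those above the 0.5 threshold. The function names and the placeholder TF labels are hypothetical.

```python
from collections import Counter


def selection_frequency(sampled_sets):
    """Fraction of sampled controller sets that contain each TF."""
    counts = Counter(tf for s in sampled_sets for tf in set(s))
    n = len(sampled_sets)
    return {tf: counts[tf] / n for tf in counts}


def key_tfs(sampled_sets, threshold=0.5):
    """TFs whose selection frequency strictly exceeds the threshold."""
    freq = selection_frequency(sampled_sets)
    return sorted(tf for tf, f in freq.items() if f > threshold)


# Hypothetical samples; the labels are placeholders, not results from the paper.
samples = [
    {"TF_A", "TF_B"},
    {"TF_A"},
    {"TF_A", "TF_C"},
    {"TF_B", "TF_C"},
]
print(selection_frequency(samples))
print(key_tfs(samples))
```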
Figure 7 shows reconstructed dynamics for nine AS events; all exhibit significant differences (Wilcoxon rank-sum, two-tailed, $P<.05$). Specifically, $\mathrm{RABGAP1L_{2}}$, $\mathrm{NA_{11}}$, $\mathrm{GIT2_{1}}$, and $\mathrm{NF1_{2}}$ are up-regulated, while $\mathrm{CCDC50}$, $\mathrm{OSBPL8}$, $\mathrm{FGFR1_{4}}$, $\mathrm{MBNL2}$, and $\mathrm{EPB41_{5}}$ are down-regulated. Table 2 reports a complementary MCC $\rightarrow$ CTC analysis (six TF constraints and AS targets), nominating PRDM8 as a likely master regulator.
Reconstructed expression dynamics of nine AS events.
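A two-tailed Wilcoxon rank-sum test of the kind cited for these AS events can be sketched as follows. The sketch uses the normal approximation with a tie correction and no continuity correction. The section does not state which sample groups are compared, so the grouping into early and late pseudotime values is an assumption, and the values and the name `rank_sum_test` are hypothetical.

```python
import math


def rank_sum_test(group1, group2):
    """Two-tailed Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((group1, group2)) for v in vals)
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    w = 0.0         # sum of ranks belonging to group1
    tie_term = 0.0  # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        t = j - i + 1
        tie_term += t ** 3 - t
        w += avg_rank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        return 0.0, 1.0  # all values identical: no evidence of a shift
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed p-value
    return z, p


# Hypothetical values of one AS event in early vs late pseudotime samples.
early = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.27]
late = [0.40, 0.42, 0.45, 0.50, 0.55, 0.58, 0.60, 0.65]
z, p = rank_sum_test(early, late)
print(z, p)
```

For small groups an exact test is preferable to the normal approximation. A library routine such as `scipy.stats.mannwhitneyu` provides one.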
Together, Tables 1 and 2 nominate HOXA3, PRDM8, and TWIST2 as EMT-associated regulators with literature support, suggesting potential therapeutic relevance.
Biological functional analysis
Figure 8a summarizes GEPIA2 expression ($|\log_{2}\mathrm{FC}|>1$, $q<0.01$): HOXA3 is up-regulated in GBM, KIRP, PAAD, STAD, and THYM; PRDM8 in LAML and PAAD; TWIST2 in HNSC. Figure 8b presents overall-survival associations (Mantel–Cox): HOXA3 is high-risk in KIRC, LGG, and LUAD; PRDM8 in LUAD and UVM; TWIST2 in GBM, KIRP, LUSC, and UVM.
Results of gene expression profiling and survival analysis. (a) Differential expression across cancers; (b) overall-survival significance map of hazard ratios (HR).
Figure 9 reports GO enrichment for genes co-varying with each TF (GEPIA2 selection; Metascape analysis). Genes similar to HOXA3 are enriched for anterior/posterior patterning, glycosyl-compound catabolism, ripoptosome, and apoptosis (Fig. 9a); genes similar to PRDM8 for plasma-membrane cytoplasmic side, cell–substrate junction, and junction organization (Fig. 9b); and genes similar to TWIST2 for extracellular matrix, skeletal development, collagen metabolism, and angiogenesis (Fig. 9c).
Enrichment analysis of pathways and biological processes was performed using Gene Ontology (GO) Biological Processes. Enriched terms for genes similar to HOXA3 (a), PRDM8 (b), and TWIST2 (c).
Regulation of alternative splicing events of the CD44 gene in breast cancer
The CD44 pre-mRNA contains 19 exons, nine of which are alternatively spliced; prior work links CD44 splicing to EMT and metastasis [41–43]. Figure 10 shows reconstructed dynamics for nine CD44 AS events; two ($\mathrm{CD44_{4}}$ and $\mathrm{CD44_{8}}$) are not significant ($P>.05$), so network analysis focuses on the remaining seven. Figure 11 displays the CD44 TF/RBP regulatory network built from the top 20 TFs/RBPs selected by trend analysis.
Reconstructed expression dynamics of the CD44 gene.
The CD44 regulatory network.
Table 3 summarizes CTC with 20 TF constraints and 7 AS targets, nominating ZNF521 as a key driver. Table 4 reports an MCC-preselected analysis (14 TF constraints and the same 7 targets), nominating HIC1. Together, the CD44 case points to ZNF521 and HIC1 as candidate regulators supported by the literature.
Conclusion
In this paper, we introduced CTAS, a network control theory-based framework that integrates pseudotime ordering, trend analysis, sparse directed network inference, and control-theoretic screening to uncover TFs that control AS during EMT. By explicitly modeling the hierarchical TF $\rightarrow$ RBP $\rightarrow$ AS cascade, CTAS provides a principled way to convert static cross-sectional data into dynamic regulatory insights. Through both simulations and application to a breast cancer EMT cohort, CTAS demonstrated high accuracy in reconstructing pseudotime trajectories and directed networks, and successfully identified biologically supported TFs and subnetworks that regulate AS programs.
The main advantage of CTAS lies in its ability to bridge cross-sectional data and temporal dynamics, thereby offering a novel perspective to prioritize regulators that are most capable of driving splicing changes. This not only deepens our understanding of AS regulation in EMT but also provides experimentally testable hypotheses for cancer biology. Looking forward, CTAS can be extended in several directions. Future work will adapt the framework to more complex trajectories, such as branching processes captured by single-cell transcriptomics, and incorporate perturbation data to strengthen causal inference. In addition, applying CTAS to other biological contexts beyond EMT will help generalize its utility for studying dynamic regulatory programs. Overall, CTAS opens a new avenue for integrating network control theory with omics data to dissect layered regulatory cascades and identify key molecular drivers of AS.
Key Points
- This paper proposes a novel framework to identify key regulatory transcription factors (TFs) of alternative splicing (AS) during epithelial–mesenchymal transition based on network control theory.
- This paper develops a new method (CTAS) and compares it with state-of-the-art computational models to illustrate the superiority of the proposed approach. In response to the limitations of existing methods, CTAS reconstructs the hierarchical regulatory relationships among TFs, RNA-binding proteins (RBPs), and AS events to improve regulatory inference.
- CTAS further integrates pseudotime analysis and dynamic network modeling to overcome the limitations of cross-sectional data and uncover latent regulatory mechanisms.
- Biological validation and theoretical analysis are conducted to demonstrate the reliability and interpretability of the proposed framework. Experimental results show that CTAS has strong robustness and generalization ability across simulated and real biological datasets.
Supplementary Material
S1-Dataset_bbag042
S2-Pseudotime_analysis_bbag042
S3-Temporal_trend_analysis_bbag042
S4-Model__bbag042
S5-Detailed_algorithm_bbag042
S6-Robustness_analysis_bbag042
S7-Testing_method_with_a_synthetic_dataset_bbag042
supp_table_bbag042
References
- 1. Valastyan S, Weinberg RA. Tumor metastasis: molecular insights and evolving paradigms. Cell 2011; 147:275–92. doi:10.1016/j.cell.2011.09.024. PMID: 22000009. PMCID: PMC3261217.
- 2. Kalluri R, Weinberg RA. The basics of epithelial-mesenchymal transition. J Clin Invest 2009; 119:1420–8. doi:10.1172/JCI39104. PMID: 19487818. PMCID: PMC2689101.
- 3. Nieto MA, Huang RYJ, Jackson RA. et al. EMT: 2016. Cell 2016; 166:21–45. doi:10.1016/j.cell.2016.06.028. PMID: 27368099.
- 4. Zeisberg M, Neilson EG. Biomarkers for epithelial-mesenchymal transitions. J Clin Invest 2009; 119:1429–37. doi:10.1172/JCI36183. PMID: 19487819. PMCID: PMC2689132.
- 5. Mani SA, Yang J, Brooks M. et al. Mesenchyme Forkhead 1 (FOXC2) plays a key role in metastasis and is associated with aggressive basal-like breast cancers. Proc Natl Acad Sci USA 2007; 104:10069–74. doi:10.1073/pnas.0703900104. PMID: 17537911. PMCID: PMC1891217.
- 6. Yang J, Mani SA, Donaher JL. et al. Twist, a master regulator of morphogenesis, plays an essential role in tumor metastasis. Cell 2004; 117:927–39. doi:10.1016/j.cell.2004.06.006. PMID: 15210113.
- 7. Roy B, Haupt LM, Griffiths LR. Alternative splicing (AS) of genes as an approach for generating protein complexity. Curr Genomics 2013; 14:182–94. doi:10.2174/1389202911314030004. PMID: 24179441. PMCID: PMC3664468.
- 8. Pan Q, Shai O, Lee LJ. et al. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 2008; 40:1413–5. doi:10.1038/ng.259. PMID: 18978789.
