Construction of a prognostic model for colorectal cancer liver metastasis based on single-cell transcriptomics and regulation of the MIF pathway
Jianqiang Cao, Zhaobin He, HuiJie Gao, Yongzhe Yu, Shengbiao Yang, Xiqiang Wang, Jun Niu, Cheng Peng

TL;DR
This study builds a prognostic model for colorectal cancer liver metastasis using single-cell data and explores the MIF pathway's role in tumor spread.
Contribution
A novel prognostic model for colorectal cancer liver metastasis based on single-cell transcriptomics and MIF pathway regulation.
Findings
TPM2, RPS17, and TNNT1 are significantly elevated in metastatic colorectal cancer epithelial cells.
SPINK4 expression is reduced in colorectal cancer with liver metastasis.
The MIF pathway interacts with CD74 and CXCR4 to promote tumor invasion and migration.
Abstract
Colorectal cancer is associated with a generally poor prognosis, primarily due to its often late diagnosis and the high propensity for liver metastasis. Current treatment strategies emphasize personalized approaches, integrating advanced targeted therapies based on specific molecular profiles to enhance outcomes. Continued research into molecular targets and innovative treatments is crucial for improving survival rates and managing disease progression. We retrieved single-cell transcriptomic and bulk RNA-seq data from colorectal cancer samples in the GEO and TCGA databases. The analysis focused on changes in pathway and gene expression in epithelial cells during the metastatic progression. A prognostic risk model was developed based on differentially expressed genes, and experimental validation confirmed the differential expression of prognostic-related genes in colorectal cancer…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMacrophage Migration Inhibitory Factor · Nuclear Receptors and Signaling · Galectins and Cancer Biology
Introduction
1
Colorectal cancer is a significant cause of both morbidity and mortality worldwide. By 2040, projections estimate that the incidence and mortality rates will rise to 3.2 million and 1.6 million cases, respectively (1). Liver metastasis is an important clinical feature of colorectal cancer and leads to an inferior prognosis. The five-year survival rate for patients with non-metastatic colorectal cancer is around 60%. In contrast, for metastatic colorectal cancer, particularly in stage IV cases with liver metastasis, the five-year survival rate drastically decreases to about 12% (2).Treatment for liver-metastatic colorectal cancer typically involves a combination of surgery, systemic therapies, and targeted liver treatments. Unfortunately, around 40%-50% of these patients are inoperable at the time of diagnosis, leaving palliative chemotherapy as the only treatment option. Introducing innovative targeted therapies has become crucial in the personalized, comprehensive management of colorectal cancer (3).
According to a meta-analysis, for patients with unresectable liver-metastatic colorectal cancer, combining chemotherapy with targeted therapies increases the overall response rate to 68%, compared to 43% with chemotherapy alone (4). In current clinical practice, primary treatments for these patients include anti-epidermal growth factor receptor (anti-EGFR) antibodies and anti-vascular endothelial growth factor (anti-VEGF) antibodies, such as cetuximab and panitumumab, which are known to provide additional survival benefits (5–7). Research by Modest and colleagues has demonstrated that KRAS-targeted therapies can lead to early tumor shrinkage rates exceeding 20%, significantly improving disease-free and overall survival (8). Targeting the CLDN1 gene has been shown to reduce colorectal cancer cell growth and survival, effectively inhibiting cell proliferation and migration (9). Moreover, Yuan-Hong Xie’s research indicates that overexpression of the MET and HGF genes is associated with poor prognosis in colorectal cancer patients. Additionally, interactions between MET and EGFR can result in compensatory activation of one receptor when the other is inhibited, suggesting that MET may play a role in resistance to EGFR inhibitors (10). Identifying additional critical pathways and genes, along with developing therapeutics targeting these areas, is essential for improving the prognosis of colorectal cancer patients and reducing the incidence of metastases.
In recent years, the advancement of single-cell sequencing technologies has led to a growing number of studies integrating scRNA-seq and bulk RNA-seq data to develop prognostic models for colorectal cancer (11). These studies have substantially propelled progress in translational research within this domain. The progression of colorectal cancer typically follows a continuous trajectory, moving from normal tissue to malignancy and eventually to metastasis. This document highlights research into the essential biological processes involved in tumor development, starting with single-cell transcriptomic analyses, focusing on examining and comparing the significant changes in genes, pathways, and metabolic processes in primary colorectal cancer tissues with or without liver metastasis. Additionally, it explores prognosis-associated genes in epithelial tissues. It constructs a prognostic model, offering valuable insights into the malignant evolution of colorectal cancer and helping to inform clinical treatment strategies.
Materials and methods
2
Clustering and marker gene identification in single-cell data
2.1
The single-cell transcriptomic data was sourced from the GEO database hosted by the NCBI website. This research specifically utilized data from GSE161277 and GSE178318, following the clinical information provided for each patient cohort (12, 13). The GSE161277 dataset includes samples of normal and non-metastatic colorectal cancer tissues, while the GSE178318 dataset contains samples from metastatic colorectal cancer and liver metastases. We analyzed a total of 19 samples, including 3 Normal, 4 Colorectal Cancer, 6 Colorectal Cancer with Metastasis to the Liver, and 6 Liver Metastasis, with 9,000 cells randomly selected from each sample for downstream analyses. Data integration across different disease states was achieved using the Read 10x and CreatSeuratObject packages, with genes showing low expression levels (defined as expression in at least three cells with a count exceeding 200) excluded (14).
The raw data quality was controlled using R software version 4.2.3, which excluded cells where mitochondrial gene ratios exceeded 20% and erythrocyte gene ratios exceeded 3%. After normalization, the filtered data were analyzed to identify highly variable genes. The Harmony package eliminated batch effects and performed principal component analysis (PCA) (15). The elbow plot function was used to select the top 20 principal components for analysis, with a resolution of 0.5 set for clustering. Differentially expressed genes (DEGs) among different cell types were identified using the Wilcoxon rank sum test via the “FindMarkers” function. Statistically significant genes were defined as those with a Bonferroni-corrected p-value below 0.05. Genes with a log-fold change (logFC) greater than one were selected, indicating that their average expression level within the cluster was more than double that of other clusters. Cell types within each cluster were annotated using established marker genes (16–21).
Gene ontologyand kyoto encyclopedia of genes and genomes functional enrichment analysis
2.2
The cluster profile package in R was used for GO and KEGG pathway enrichment analyses (22). These analyses were performed with a log-fold change (logFC) threshold of 1, focusing on pathways with P-values below 0.05.
Prognostic model construction based on TCGA and GEO cohorts
2.3
RNA-seq count data from colorectal cancer patients were downloaded from the official TCGA site. These counts were converted into transcripts per million (TPM) to normalize marker gene expression levels across patients. The Cox proportional hazards model was applied to assess the relationship between key genes and overall survival, considering age, sex, tumor staging, and risk scores (23). After removing duplicate entries and excluding cases with incomplete clinical information, prior radiotherapy or chemotherapy, concurrent malignancies, or missing survival data, a total of 515 patients were included in the analysis and designated as the training cohort for prognostic model development. In parallel, transcriptomic and clinical data from the GSE12945 and GSE29623 datasets in the Gene Expression Omnibus (GEO) database were retrieved. Applying the same inclusion and exclusion criteria, 127 samples were included as the validation cohort.
Analysis of single-cell trajectories
2.4
Single-cell trajectories were constructed using the Monocle2 R package, with the orderCells function used to sequence the cells (24). This approach provided insights into the continuity of cellular state transitions, resulting in a time-oriented dendrogram. Generally, the progression from the dendrogram’s root to its branches represents the flow of time, with cells along this timeline exhibiting distinct states, possibly including various subgroups.
Cell-cell communication analysis
2.5
CellChat was employed to infer cellular communications from single-cell RNA data. This analysis identified the signaling molecules and receptors involved, quantified the interaction strengths between cells, and highlighted key nodes and pathways within the communication network (25).
Human tumor specimens
2.6
The Shandong University Qilu Hospital system procured tumor specimens from ten patients diagnosed with colorectal cancer, five of whom also had liver metastases. Importantly, none of these individuals had undergone any pre-surgical treatments, including radiotherapy, chemotherapy, or immunotherapy. After collection, each tumor specimen was carefully preserved at -80 °C. Diagnostic evaluations were conducted by two pathologists, both highly experienced in their field. Comprehensive clinical data were gathered for all enrolled patients, and informed consent was obtained from each participant. The ethical integrity of this study was approved by the Ethics Committee of Qilu Hospital of Shandong University, ensuring that all procedures strictly followed the principles outlined in the Declaration of Helsinki.
Western blotting
2.7
Cellular proteins were extracted using a DSS lysis buffer containing protease and phosphatase inhibitors. Protein concentrations were quantified using a BCA assay kit, and samples were separated on an SDS-PAGE gel. The proteins were then electrophoretically transferred to a PVDF membrane. After blocking with skim milk for one hour, the membrane was washed three times with TBST and incubated overnight at 4 °C with the primary antibody. Following three additional washes with TBST, the membrane was incubated at room temperature for one hour with the secondary antibody. Bands were visualized using chemiluminescent detection reagents. The antibodies used were: Rabbit anti-SPINK4 from Abcam (catalog #AB175929) at a 1:2000 dilution, Rabbit anti-RPS17 from Abclonal (catalog #A16426) at a 1:1000 dilution, Rabbit anti-TNNT1 from Abclonal (catalog #A10354) at a 1:1000 dilution, Rabbit anti-TPM2 from Abclonal (catalog #A3096) at a 1:700 dilution, and Mouse anti-GAPDH from Proteintech (catalog #60004-1-Ig) at a 1:5000 dilution.
Immunohistochemical staining
2.8
Cancer tissues were fixed overnight in a 4% formaldehyde solution, followed by embedding and sectioning. The sections were deparaffinized with xylene and rehydrated through a graded series of ethanol solutions. After antigen retrieval and blocking, the sections were incubated with the primary antibody, followed by washes of TBST. They were then incubated with the secondary antibody and stained with hematoxylin to visualize the nuclei. Microscopic examination of the stained sections provided insights into the expression and distribution of the protein. The antibodies used in this study included Rabbit anti-TPM2 from Abclonal (catalog #A3096) at a 1:100 dilution, Rabbit anti-RPS17 from Abclonal (catalog #A16426) at a 1:100 dilution, Rabbit anti-TNNT1 from Bioss (catalog #bs-10616R) at a 1:200 dilution, and Rabbit anti-SPINK4 from Immunoway (catalog #YN3682) at a 1:100 dilution.
Statistical analysis
2.9
All single-cell data analyses and statistical tests were conducted using the R programming language. Statistical significance was defined as a p-value below 0.05. Cox regression was employed for survival analysis to develop predictive models.
Results
3
Single-cell atlas and cell lineage annotation in colorectal cancer tissue
3.1
Based on single-cell transcriptomic analysis of four tissue samples, namely NOR(Normal), CRC(Colorectal Cancer), CRCM(Colorectal Cancer with Metastasis to the Liver), and LM(Liver Metastasis), the results show uniform Manifold Approximation and Projection (UMAP) clustering identified 16 distinct clusters (Supplementary Figure 1A). Analyzing sample identifiers within the dataset confirmed the successful mitigation of batch effects (Supplementary Figure 1B). We categorized all cells into seven main types: epithelial cells, T cells, B cells, macrophages, plasma cells, fibroblasts, mast cells (Figure 1A), and each cluster’s cell type was annotated using canonical marker genes (Figure 1B).
Single-cell atlas of colorectal cancer tissue. (A) Annotations for seven different types of cells in the tissue samples of NOR, CRC, CRCM, and LM. (B) Bubble chart annotating marker genes corresponding to the four tissue types to characterize cell types. (C) Annotations for seven different cell types in the tissue samples of CRC and CRCM. (D) Bubble chart marking marker genes for the two tissue types to annotate cell types. (E) Cluster13 has high similarity with cluster1 and cluster0. (F) Non-negative least squares calculations suggest that cluster 13(C13), cluster0(C0), cluster1(C1) share similar cellular properties with clusters0(T cell), identifying them as T immune cells. (G) Distribution of various cell types across the four tissue samples.
Further analyses focused on the cell populations at the primary sites of colorectal cancer (both CRC and CRCM) to identify cellular and genetic changes during the metastatic process. Cells were reanalyzed using the standard Seurat pipeline, resulting in 18 clusters (Supplementary Figure 1C). Seven cell types were identified (Figure 1C), and each cluster was annotated using canonical marker genes to classify the cell types (Figure 1D). Cells from both sources were evenly mixed, indicating successful elimination of batch effects (Supplementary Figure 1D). Cluster 13, initially grouped with epithelial cell markers, was reclassified as a T cell type based on gene markers from a bubble plot. Spearman’s correlations between clusters were calculated, and a heatmap revealed a close relationship between Cluster 13 and Clusters 0 and 1 (Figure 1E). Non-negative least squares regression was used to explore correlations between all cells and the CRC&CRCM cells clusters, further detailing the relationships among cluster subtypes (26). A heatmap demonstrated that Cluster 13, Clusters 0 and 1 are closely associated with C0(Tcell) in the overall dataset, predominantly composed of T cells, confirming that Cluster 13 is likely made up of T cells rather than epithelial cells (Figure 1F). The cell proportion diagram shows that T and epithelial cells are the predominant cell types across all samples (Figure 1G). Notably, from non-metastatic colorectal cancer to metastatic states and liver metastases, there is a progressive increase in T cells, accompanied by a decrease in epithelial and B cells. In recent years, accumulating evidence has highlighted the pivotal role of the tumor immune microenvironment and the dynamic shifts in immune cell populations in the initiation and progression of colorectal cancer (27, 28), the tumor microenvironment exerts a profound influence on the recruitment and activation of tumor-infiltrating lymphocytes, which may underlie the observed immune cell alterations across different disease stages (29).
The prognostic risk-scoring signature predicted that the expression of four genes was strongly correlated with the patient’s prognosis
3.2
Using R’s FindMarkers function, differential gene expression among various cell types in CRC and CRCM samples was analyzed, focusing on the differing pathological conditions of the samples. This function helps identify genes that show significant expression differences between two groups. The thresholds for identifying significant differential expression were set at p < 0.05 and log fold change (log FC) > 1. Differentially expressed genes (DEGs) were identified for each cell type, with particular attention given to the top five upregulated and downregulated genes (Figure 2A). In the comparative analysis of epithelial cells from two different sample sets, 243 DEGs were identified. A detailed Gene Ontology (GO) enrichment analysis was then performed to explore changes in colorectal cancer tissue states post-metastasis. The results showed that the related genes were primarily enriched in pathways involved in inflammatory processes, metabolic functions, cell proliferation, migration, and specific enzymatic activities (Figure 2B). Additionally, the AUCell algorithm was used to quantify changes in pathway activities related to colorectal cancer tissues before and after metastasis (30). This analysis highlighted a significant increase in pathway activities associated with inflammation, metabolism, cell proliferation, migration, and responses to external stimuli during the metastatic transition (Supplementary Figure 2). Gene Set Enrichment Analysis (GSEA) revealed multiple pathways significantly enriched in metastatic colorectal cancer lesions compared to non-metastatic lesions (Supplementary Figure 3). The figures illustrate that a sharp rise in the enrichment score (ES) curves indicates a coordinated upregulation of gene sets in metastatic colorectal cancer samples, while a gradual decline indicates downregulation.
Analysis of differences between CRC and CRCM tissues. (A) The differential genes among seven different cell types in two types of tissue, marking the top five upregulated and downregulated genes. (B) The Gene Ontology (GO) pathways enriched by differential genes, predominantly associated with inflammation, metabolism, and proliferation migration. (C) The changes in metabolic activity across seven cell types in two tumor samples. (D) UMAP plots of epithelial cells from the two samples.
A prognostic analysis model was developed for the 243 differentially expressed genes (DEGs) identified in epithelial cells. CRC expression data from GSE12945 and GSE29623 in the GEO database were downloaded and merged after eliminating batch effects for validation (31, 32). The training set utilized FPKM and clinical data from the TCGA database, excluding samples with an overall survival (OS) of zero. The integration of GEO and TCGA databases narrowed the list to 198 DEGs for further analysis. Univariate analysis identified 15 prognosis-related DRGs (Figure 3A), which were subjected to multivariate Cox regression analysis, hazard ratios (HRs) were visualized using a forest plot. Next, a RS(Risk Score) signature of four DEGs (TPM2, RPS17, TNNT1, and SPINK4) was created. Next, the RS of each sample was computed based on relative coefficient and DEGs expression. RS was calculated as follows: RS=(0.163822904780406)* expression of TPM2+(0.19795829828391)* expression of. RPS17+(0.140084452797051)* expression of TNNT1+(-0.0845093126747071)* expression of SPINK4. The TCGA dataset was used to train the model, which calculates patient risk scores based on gene expression. Patients were classified into high-risk and low-risk groups using RS median through a Cox regression prognostic model. The same median was applied to the GEO dataset for validation, segmenting patients into high- and low-risk groups. The training cohort comprised 515 patients, with 257 classified as high-risk and 258 as low-risk. The validation cohort consisted of 127 patients, including 71 high-risk and 56 low-risk cases. Survival curves were plotted for the training and validation sets to assess prognostic differences between the high-risk and low-risk groups. In the training set, significant prognostic differences were observed between the two groups (p < 0.001) (Figure 3B left). Similarly, in the validation set, the prognosis for high-risk patients was notably worse compared to the low-risk group (p < 0.05) (Figure 3B right). The model’s predictive performance was further validated using the area under the ROC curve (AUC), with AUC values of 0.676, 0.665, and 0.685 in the training set and 0.698, 0.626, and 0.661, in the validation set (Figure 3C). These metrics demonstrate the model’s substantial predictive accuracy.
Prognostic analysis of differentially expressed genes in epithelial cells. (A) Forest plot displaying prognostically significant genes identified from the training cohorts. (B) The differential survival rates between high-risk and low-risk groups across the validation and training cohorts. (C) Computes the area under the ROC curve for the survival model in both the validation and training sets, assessing predictive performance. (D) Forest plots for univariate and multivariate prognostic analyses, respectively, highlighting the influence of various factors on outcomes. (E) Nomogram to estimate the 1-year, 3-year, and 5-year survival probabilities based on identified risk factors. (F) Calibration curve evaluating the predictive accuracy of the survival analysis model.
Univariate and multivariate prognostic analyses using clinical data and risk scores from the TCGA database were conducted to identify risk factors impacting patient survival. These analyses revealed significant associations between age, cancer staging, and risk scores with survival rates (p < 0.001), underscoring the statistical significance and independent prognostic value of the risk score (Figure 3D). The cohort consisted of 84 patients with stage I disease, 190 patients with stage II disease, and 123 patients with stage III disease. A predictive nomogram was developed based on the identified prognostic factors to estimate survival probabilities for patients with metastatic colorectal cancer (Figure 3E). The nomogram consists of eight axes; the first four represent the variables of the predictive model. A vertical line drawn across these axes allows for calculating the risk score for each factor by summing the individual scores. The last three axes indicate one-year, three-year, and five-year survival probabilities. Calibration curves showed a strong correlation between the predicted and observed survival rates, validating the nomogram’s accuracy in predicting patient outcomes (Figure 3F).
Western blot analysis revealed higher expression levels of TPM2, RPS17, and TNNT1 genes in CRCM tissues compared to primary CRC, whereas SPINK4 expression was slightly reduced in CRCM tissues (Figures 4A, B). Immunohistochemical experiments were conducted to confirm further these genes’ distribution and expression in cancer tissues. The results demonstrated significantly increased expression of RPS17 and TNNT1 in CRCM tissue samples compared to CRC tissues. Changes in TPM2 expression were less pronounced, possibly due to limited sample size, which did not yield statistically significant differences. Additionally, SPINK4 expression showed a declining trend in CRCM tissues, corroborating the Western blot findings (Figure 4C).
*Verification of differentially expressed genes expression in tumor tissues. (A, B) The comparative protein expression levels of four high-risk prognostic genes in CRC and CRCM tissues. (C) Immunohistochemistry detailing the distribution and expression intensity of four genes (10×20). Indicates that the difference is statistically significant (P < 0.05).
Pseudotime analysis revealed changes in the expression of four genes as the disease progressed
3.3
Trajectory analysis of epithelial cells from colorectal cancer, with and without metastasis, was conducted. Using Monocle2, the cell trajectories were divided into three distinct states, outlining the differentiation paths of these cells following dimensionality reduction clustering. The epithelial cell differentiation trajectory branches off from the main fork, extending from the upper left toward the right in the figure (Figure 5A upper).In the pseudotime analysis, States 1 and 2 correspond to non-metastatic colorectal cancer cells, while State 3 includes both metastatic and non-metastatic cells (Figure 5A lower).
Epithelial cell pseudo-time analysis. (A) The developmental trajectories of cells and the distribution of cells from two sample states in the pseudo-time analysis. (B) The expression distribution of the RPS17, SPINK4, TNNT1, and TPM2 genes across two different tissue samples and three states. (C) The expression changes of four genes along a pseudo-time trajectory. (D) Comparative analysis of copy number variations (CNV) in epithelial cells between CRC, CRCM, and NOR samples.
Subsequent studies explored the transcriptomic variations of differentially expressed genes (DEGs) during epithelial cell differentiation. We focused on four genes with significant prognostic relevance, tracking their dynamic changes throughout this process. The expression levels of TPM2, RPS17, and TNNT1 progressively increased during the progression and metastasis of colorectal cancer (CRC); in contrast, SPINK4 expression initially rose before sharply declining (Figures 5B, C).
Chromosomal copy number variation
3.4
Studies indicate that in primary liver lesions and their corresponding metastatic liver sites, copy numbers for chromosomes 7, 8, and 20 increase, while chromosomes 1, 4, 8, 17, and 18 show a reduction. Comparative analyses between primary tumors and their metastatic liver counterparts reveal chromosomal changes, including gains in chromosomes 7, 8, 13, and 20 and losses in chromosomes 1, 8, 14, and 18.
In the comparative analysis of epithelial cells from tumor samples before and after metastasis (Figure 5D), chromosomes 6, 7, and 20 exhibited significant levels of copy number variation (CNV), suggesting that CNVs in these regions may be linked to the loss or amplification of tumor suppressor genes, oncogenes, or non-coding RNAs like miRNAs.
Cell-cell communication reveals that the MIF-CD74+CXCR4 ligand pair plays a role in colorectal cancer metastasis
3.5
Comprehensive analyses were performed on CRC and CRCM. The results indicated that before metastasis, epithelial cells and fibroblasts exhibited high levels of communicative activity. In contrast, B cells showed weaker interactions with other cell types (Figures 6A, B), suggesting the dominance of humoral immune responses at this stage. Fibroblasts were strongly connected with various cell types, highlighting their crucial role in tumor progression and the shaping of the tumor microenvironment.
Analysis of intercellular communication between CRC and CRCM tissues. (A, B) Comparative analysis of pathway quantity differences between two tissue types. (C) The differences in ligand-related pathway activities between CRC and CRCM tissues. (D) That the primary source of MIF ligand pairs is epithelial cells and fibroblasts, while the receptors are primarily B cells, T cells and macrophages. (E) Violin plot showing the expression levels of pathway molecules related to the MIF ligand across various cell types.
Bar graphs illustrated changes in pathway activities between CRC and CRCM tissues, with pathways such as CXCL, IFN-II, TNF, SPP1, THBS, and COMPLEMENT showing increased activity in CRCM tissues (Figure 6C). This rise indicates an intensified inflammatory state in these tissues, which promotes tumor angiogenesis and enhances metastatic potential. These findings highlight the critical roles these molecules play in tumor proliferation, invasion, immune evasion, and the shaping of the tumor microenvironment. Cell-cell interactions were analyzed based on ligand-receptor pairs for the seven cell types in each sample (Supplementary Figure 4A). In CRCM samples, specific ligand-receptor pairs, particularly MIF-CD74+CD44, MIF-CD74+CXCR4, and APP-CD74, were identified as particularly significant. Given the pronounced impact of the MIF pathway in epithelial and fibroblast cells, further analyses identified fibroblasts and epithelial cells as ligands, with T cells, B cells, and macrophages acting as receptors (Figure 6D). Bubble plots highlighted differences in the activities of receptor-ligand pairs involving epithelial and fibroblast cells across the two samples (Supplementary Figure 5A). Heatmaps comparing the number and intensity of pathways between these samples revealed a significant increase in the communication strength and activity of epithelial cells in CRCM tissues (Supplementary Figures 4B, C). This analysis disclosed dynamic interaction patterns, particularly the intensified communication of epithelial cells in metastatic colorectal cancer. The MIF pathway, especially the MIF-CD74+CXCR4 pair, was identified as a critical contributor, consistent with its key role in tumor progression and metastasis (33).
Subsequent analyses of the MIF pathway and its associated ligand interactions among different cell types revealed that MIF expression is primarily derived from epithelial cells and fibroblasts. The violin plots (Figure 6E) show that macrophages exhibited high CD44 and CD74 expression levels, while CXCR4 was predominantly expressed in T and B cells.
The PPI network emphasizes the weight value of CD74
3.6
A comprehensive protein-protein interaction (PPI) network analysis was conducted on 243 differentially expressed genes identified from an epithelial cell gene expression study (34). These genes were initially entered into the STRING database using the multiple protein options to generate a detailed PPI network, with the corresponding raw data subsequently preserved (Supplementary Figure 5B). The raw data were then visually transformed using Cytoscape software (35). Within Cytoscape, the network was analyzed using the CytoNCA plugin, which applies the “betweenness centrality” algorithm to predict and explore key nodes in the network. This algorithm measures the centrality of nodes in a network graph by assessing the shortest paths between them and identifying their importance.CD74 emerged as the node with the highest weighted value in the protein network (Supplementary Figure 5C). In the context of cellular communication, this finding confirmed the critical role of CD74-related signaling pathways in tumor metastasis, aligning with the results of the PPI network analysis.
Discussion
4
Liver metastasis is a pivotal clinical feature of colorectal cancer, often leading to a poor prognosis. Significant clinical interest is understanding the mechanisms driving this metastatic process and developing effective therapeutic strategies. Single-cell transcriptomics offers a powerful approach to dissecting tumor progression dynamics and identifying viable targets for personalized cellular therapies.
We detailedly analyzed gene expression changes in epithelial cells using single-cell transcriptomics. During the metastasis process, we observed significant increases in the expression of the TPM2, RPS17, and TNNT1 genes in colorectal cancer epithelial cells, while SPINK4 expression was notably reduced. These findings were confirmed through Western blot and immunohistochemical staining, suggesting that the overexpression of these genes may drive malignant behaviors such as migration, invasion, and liver metastasis in colorectal cancer. In our study, TNNT1 demonstrated consistent expression patterns across RNA-seq, Western blotting (WB), and immunohistochemistry (IHC), whereas the other examined genes exhibited only modest alterations at the protein level. Several factors may account for this discrepancy: the relatively small validation cohort, which reduced statistical power; inherent biological differences between mRNA and protein expression arising from post-transcriptional regulation; the limited sensitivity of WB and IHC for detecting subtle protein-level changes; and potential cell-type-specific expression patterns that may be masked in bulk tissue analyses. Such challenges are frequently encountered in integrative studies, highlighting the complexity of gene regulation during metastasis, as reflected in our WB and IHC results (Figure 5). Additionally, analyses of patient prognosis data from the GEO and TCGA databases revealed a strong correlation between these four genes and poor outcomes in our constructed prognostic models, supporting previous research on these genes in colorectal cancer tumors.
Troponin T, a protein with a molecular weight of 30–35 kDa and composed of 220–300 amino acids, primarily expressed in skeletal muscles, functions as part of a regulatory complex on filaments, playing a critical role in modulating muscle contraction and relaxation through calcium signaling. Elevated TNNT1 levels have been shown to drive proliferation, migration, and invasion in colorectal, thyroid, and uterine sarcoma cancers (36). It accelerates cancer progression by influencing the cell cycle and promoting E-cadherin during the epithelial-mesenchymal transition (EMT), thus facilitating tumor metastasis (37, 38). Studies by Yu Chen and colleagues have empirically established TNNT1’s oncogenic role in colorectal cancer progression (39). C. Jiang’s research shows that Tnnt1 overexpression markedly increases MMP-2 and MMP-9 protein levels in cells while suppressing E-cadherin expression (40). This regulation of the epithelial-mesenchymal transition (EMT) process reduces adhesion among tumor cells, promoting their migration and metastasis. Furthermore, TNNT1 overexpression induces cyclin D expression, accelerating the G1/S phase transition and reducing the activity of caspases 3 and 7—key apoptosis markers (41). This shortens the cell cycle, promotes cellular proliferation, and significantly enhances tumor growth. In TNNT1-knockdown cells, Xiaobin Ge et al. observed reduced levels of β-catenin and cMyc, suggesting that TNNT1 suppression may inhibit the Wnt/β-catenin pathway (42, 43). Additional research reveals that Tnnt1 strongly activates the p38/JNK signaling pathway, which, when aberrantly activated, contributes to excessive cell proliferation and reduced apoptosis, thereby facilitating the progression, migration, and invasion of various cancers (40).
TPM2 (β-tropomyosin) is a member of the tropomyosin family, encoding the β-chain with a molecular weight of approximately 32 kDa. It plays a crucial role in regulating muscle contraction by forming complexes with actin and troponin, orchestrating actin filament assembly, and controlling actin nucleation. Our research identified significant amplifications at the chromosome locus 9p. TPM2 is associated with unfavorable tumor prognoses, and prior studies have confirmed its function as an oncogenic glycoprotein that promotes proliferation and metastasis in colorectal cancer (44). It also exerts carcinogenic effects in other cancers, such as prostate cancer. Dysregulation of actin-binding proteins compromises the stability of the actin cytoskeleton, which may contribute to tumorigenesis (45, 46). We propose that increased TPM2 expression enhances cytoskeleton dynamics, altering cellular adhesion to the extracellular matrix (ECM) and modifying tumor cell interactions with the microenvironment, including angiogenesis and immune evasion, thereby increasing migratory and invasive capacities.
Although previous studies, such as Cui, reported downregulation of TPM2 in CRC cell lines, these findings were derived from in vitro models and may not fully capture the gene’s behavior in patient-derived tumor tissues (47). The choice of experimental model is critical: cell lines reflect intrinsic cellular behavior under controlled conditions, whereas patient-derived tumor samples encompass complex tissue interactions. Tumor stage also affects TPM2 expression and its prognostic relevance, and distinct molecular subtypes of CRC may regulate TPM2 through different transcriptional programs. Moreover, the tumor microenvironment—including stromal and immune components—can influence TPM2 levels and activity, and genetic alterations such as copy number variations may further modulate its expression and function. In our analysis of clinical CRC specimens, elevated TPM2 expression correlated with poorer prognosis. This finding aligns with recent evidence supporting a pro-metastatic role for TPM2: for instance, TPM2 can interact with Endomucin to promote proliferation and metastasis in CRC (48), and tumor-associated stromal cells overexpressing TPM2 enhance cancer cell proliferation, migration, and metastatic potential (49). Such discrepancies likely arise from multiple context-dependent factors. To further address these questions, our ongoing studies integrate single-cell and bulk transcriptomic analyses, coupled with detailed characterization of TPM2 across tumor compartments, aiming to elucidate its context-dependent role in CRC progression and metastasis.
Studies consistently show that SPINK4 mRNA and protein levels are significantly reduced in tumor tissues compared to controls. SPINK4 is prominently expressed in normal colorectal, small intestine, and stomach tissues, as well as in gastrointestinal cell lines, but its expression significantly declines in colorectal cancer (CRC) tissues (50). Regarding its prognostic value, a study by Xie et al. found no correlation between serum SPINK4 levels and either overall survival (OS) or disease-free survival (DFS) in CRC patients (51). This result may be due to the study’s brief follow-up period and limited sample size. However, investigations by Xiaojie Wang and colleagues revealed that lower SPINK4 expression correlates with stem-like features and undifferentiated states in CRC cells. They also confirmed that reduced SPINK4 protein levels in tissues are significantly associated with poorer survival rates in CRC patients (50). SPINK4 is an inhibitor of serine protease activity, indirectly regulating multiple signaling pathways and reducing cellular adhesion. Serine proteases facilitate tumor cell migration and invasion by degrading the extracellular matrix and inhibiting serine hydrolase activity. SPINK4 may block these processes by inhibiting these proteases, potentially preventing tumor metastasis.
Research on the RPS17 gene remains relatively limited. Data indicate that elevated RPS17 levels, as shown in prognostic risk models, are associated with poor outcomes (52). Additionally, studies by Meir et al. have highlighted substantial RPS17 expression in primary choroidal melanoma tissues and liver metastases, suggesting its potential role in tumor pathology (53). Previous studies have confirmed that elevated expression of RPS17 in colorectal cancer tissues is associated with poor prognosis in patients. Additionally, the upregulation of RPS17 correlates with a decrease in the tumor microenvironment (TME) score, suggesting a potential link between RPS17 expression and TME heterogeneity (11). Moreover, the increased expression of related genes, such as TPM2, raises the hypothesis that this may influence the interaction between tumor cells and the extracellular matrix, as well as the remodeling of the cytoskeleton. This regulation could modulate the expression of adhesion molecules or matrix metalloproteinases, thereby enhancing the ability of tumor cells to traverse the basement membrane and endothelial cells, facilitating their entry into the bloodstream or lymphatic system, and ultimately promoting their migratory and invasive capacities.
MIF is frequently overexpressed across a broad range of malignancies, with its expression levels positively correlated with tumor progression. CD74, a high-affinity receptor for MIF, mediates activation of the ERK signaling cascade, thereby promoting innate immune responses driven by macrophages and monocytes (54). In the context of tumor invasion and metastasis, MIF may enhance tumor cell motility through multiple mechanisms—for instance, by activating Rho GTPases and promoting the polymerization of filamentous actin to remodel the cytoskeleton, or by inducing epithelial–mesenchymal transition (EMT) via modulation of the TGF-β pathway (55). Gui-Qi Zhu and colleagues demonstrated that MIF promotes the expansion of immunosuppressive myeloid-derived cells, accelerating tumor progression and dampening antitumor immunity (33). Mechanistic studies have further shown that MIF facilitates osteosarcoma cell proliferation and metastasis by activating the RAS/MAPK signaling axis, with the degree of pathway activation closely tied to tumor burden and patient prognosis (56). Additionally, specific inhibitors such as 4-iodo-6-phenylpyrimidine have been employed to disrupt MIF signaling, markedly enhancing tumor cell sensitivity to chemotherapy, suppressing primary tumor growth and angiogenesis, and significantly reducing the incidence of distant metastasis (57). These findings highlight the potential of MIF-targeted interventions as a promising therapeutic strategy in colorectal cancer.
In our study, we found that MIF pathway activity is significantly elevated following hepatic metastasis of CRC. Epithelial cells and fibroblasts were identified as the primary sources of MIF signals, while macrophages, T cells, and B cells served as the main recipients. Among intercellular interactions, the MIF–CD74+CXCR4 axis emerged as a key communication pathway, showing particularly high activity in signaling between epithelial and B cells. Within the tumor microenvironment, macrophages and B cells contribute to tumor-associated paracrine signaling through the MIF pathway. Notably, macrophages act as central mediators in this signaling network, relaying MIF signals from epithelial cells and fibroblasts to lymphocytes—likely through the secretion of growth factors or cytokines that coordinate crosstalk between epithelial and immune cells. Notably, the influence of fibroblasts in controlling tumor tissue increases during metastasis, indicating that tumor-associated fibroblasts may release signaling molecules or alter the extracellular matrix to support tumor invasion and migration. This support may include promoting angiogenesis, modulating immune responses, and facilitating immune evasion.
Numerous studies have shown that the binding of MIF to its receptors activates the Src/PI3K signaling pathway, initiating a downstream kinase cascade in which AKT activation plays a central role (58). Once activated, AKT promotes the phosphorylation of various downstream targets, thereby modulating the molecular interactions between actin and myosin. For instance, activated LIM kinases phosphorylate actin-depolymerizing factors such as cofilin, suppressing their activity, which in turn stabilizes actin filaments and facilitates their dynamic elongation. These molecular alterations directly or indirectly affect the function of actin-binding proteins such as TPM2, as well as the structural stability and remodeling capacity of the cytoskeleton (59, 60). Based on this molecular framework, we propose that TPM2 activity, regulated by the MIF–mediated Src/PI3K signaling pathway, may contribute to the phosphorylation of myosin II chains, thereby modulating their intracellular polymerization state. This cytoskeletal reorganization could enhance tumor cell motility and ultimately promote distant metastasis.
Nonetheless, this research is not without its limitations. Predominantly, given that the majority of patients with liver metastases originating from colorectal cancer have previously undergone radiotherapy and chemotherapy, the availability of eligible CRCM patient samples is markedly limited,the sample size impedes the ability to perform a statistical analysis of differences, potentially resulting in susceptibility to considerable random variations. Consequently, to bolster the robustness and applicability of our findings, further in-depth analytical studies and the augmentation of sample sizes to ascertain the statistical significance of differences have become focal points for our forthcoming research endeavors. Identifying novel therapeutic targets linked to these key genes and pathways may provide more efficacious treatment alternatives for metastatic colorectal cancer patients, constituting a vital trajectory for future investigations.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Morgan E Arnold M Gini A Lorenzoni V Cabasag CJ Laversanne M . Global burden of colorectal cancer in 2020 and 2040: incidence and mortality estimates from GLOBOCAN. Gut. (2023) 72:338–44. doi: 10.1136/gutjnl-2022-327736, PMID: 36604116 · doi ↗ · pubmed ↗
- 2Stewart CL Warner S Ito K Raoof M Wu GX Kessler J . Cytoreduction for colorectal metastases: liver, lung, peritoneum, lymph nodes, bone, brain. When does it palliate, prolong survival, and potentially cure? Curr Problems Surg. (2018) 55:330–79. doi: 10.1067/j.cpsurg.2018.08.004, PMID: 30526930 PMC 6422355 · doi ↗ · pubmed ↗
- 3Dienstmann R Vermeulen L Guinney J Kopetz S Tejpar S Tabernero J . Consensus molecular subtypes and the evolution of precision medicine in colorectal cancer. Nat Rev Cancer. (2017) 17:79–92. doi: 10.1038/nrc.2016.126, PMID: 28050011 · doi ↗ · pubmed ↗
- 4Sabanathan D Eslick GD Shannon J . Use of neoadjuvant chemotherapy plus molecular targeted therapy in colorectal liver metastases: A systematic review and meta-analysis. Clin Colorectal Cancer. (2016) 15:e 141–7. doi: 10.1016/j.clcc.2016.03.007, PMID: 27174607 · doi ↗ · pubmed ↗
- 5Crutcher M Waldman S . Biomarkers in the development of individualized treatment regimens for colorectal cancer. Front Med. (2022) 9. doi: 10.3389/fmed.2022.1062423, PMID: 36530921 PMC 9749006 · doi ↗ · pubmed ↗
- 6Patelli G Tosi F Amatu A Mauri G Curaba A PatanèDA . Strategies to tackle RAS-mutated metastatic colorectal cancer. ESMO Open. (2021) 6(3):100156. doi: 10.1016/j.esmoop.2021.100156, PMID: 34044286 PMC 8167159 · doi ↗ · pubmed ↗
- 7Xu T Wang X Xin Y Wang Z Gong J Zhang X . Trastuzumab combined with irinotecan in patients with HER 2-positive metastatic colorectal cancer: A phase II single-arm study and exploratory biomarker analysis. Cancer Res Treat. (2023) 55:626–35. doi: 10.4143/crt.2022.1058, PMID: 36550683 PMC 10101806 · doi ↗ · pubmed ↗
- 8Modest DP Laubender RP Stintzing S Giessen C Schulz C Haas M . Early tumor shrinkage in patients with metastatic colorectal cancer receiving first-line treatment with cetuximab combined with either CAPIRI or CAPOX: An analysis of the German AIO KRK 0104 trial. Acta Oncol. (2012) 52:956–62. doi: 10.3109/0284186 X.2012.752580, PMID: 23244709 · doi ↗ · pubmed ↗
