Identifying Molecular Determinants and Therapeutic Targets in Luminal B Breast Cancer: A Systems Biology Approach
Yousef Saeidi, Masoud Ghorbani, Ali Najafi, Mehrdad Moosazadeh Moghaddam

TL;DR
This study uses systems biology to identify key genes and pathways in Luminal B breast cancer, aiming to improve diagnosis and treatment.
Contribution
The study identifies novel hub genes, transcription factors, and miRNAs specific to Luminal B breast cancer using systems biology approaches.
Findings
Top hub genes in LBBC include FGF2, EGFR, and MET, among others.
Key transcription factors identified include RELA, PPARG, and CTCF.
Potential biomarkers include CDK1, CDK2, and MAPK3.
Abstract
Luminal B breast cancer (LBBC) is on the rise worldwide, with both incidence and mortality rates steadily increasing. Early detection proves difficult due to its aggressive characteristics, most notably its heightened proliferation rate and the complex interplay of molecular biomarkers, particularly in more advanced stages. Data Sources: In the present study, we conducted an in-silico analysis of LBBC cell lines using the Gene Expression Omnibus (GEO) microarray dataset, which includes 30 LBBC and 11 normal samples. Analysis Tools: Differentially expressed genes (DEGs) were identified using RStudio. A series of analyses, including cancer data interrogation via pan-cancer analysis, eXpression2Kinases (X2K), and the Cancer Dependency Map (DepMAP), was carried out to elucidate the underlying signaling pathways, transcription factors (TFs), and kinases, as well as to perform stemformatics…
Introduction
Breast cancer is the second most commonly diagnosed cancer worldwide and the leading cause of cancer-related mortality among women (1). According to the PAM50 gene expression profiling algorithm, Luminal B (LumB) breast cancer, which makes up about 15–20% of cases, is a more aggressive subtype than Luminal A (LumA), marked by hormone receptor expression and higher proliferation as indicated by Ki-67 (2). This phenotype is associated with poorer clinical outcomes and reduced responsiveness to standard endocrine therapies, highlighting the need for novel therapeutic approaches (3, 4).
Advancements in high-throughput sequencing and omics technologies, such as genomics, transcriptomics, proteomics and metabolomics, have revolutionized cancer research by unveiling the molecular complexity of tumors (5, 6). These technologies have the ability to identify critical biomarkers for early detection, cancer stratification and personalized treatment approaches (7-9). In the context of LumB breast cancer, integrating multi-omics data provides a holistic view of tumor biology, revealing distinct molecular landscapes that drive tumor aggressiveness (10). Previous studies have elucidated specific genetic alterations in LumB tumors, notably TP53 mutations and heterogeneity in HER2 expression (11, 12). Additionally, transcriptomic analyses have provided insights into the roles of transcription factors, kinases and non-coding RNAs (ncRNAs), including microRNAs (miRNAs), in the pathogenesis of LumB breast cancer (13, 14).
Furthermore, cancer-associated fibroblasts (CAFs) within the tumor microenvironment (TME) are implicated in LumB tumor progression, therapy resistance and metastasis by modulating signaling pathways critical to proliferation, migration and angiogenesis (15-17). Despite these advances, many findings are fragmented and a systematic understanding of the molecular regulators driving LumB aggressiveness remains elusive.
To address these gaps, integrative bioinformatics approaches have emerged as powerful tools to synthesize diverse omics data. Protein-protein interaction (PPI) networks, transcriptional regulatory networks and epigenetic modifications provide insights into the interplay between genetic and non-genetic factors in tumor biology. Additionally, tools such as the Cancer Dependency Map (DepMap) and miRNA network analyses have advanced our understanding of tumor vulnerabilities, revealing potential therapeutic targets with high precision (18, 19).
This study employed a multi-omics framework to investigate LumB breast cancer at the molecular level. Using transcriptomic, epigenomic, and regulatory network analyses, we aimed to identify key molecular features and therapeutic targets that define LumB tumor biology. By integrating data from PPI networks, signaling pathway enrichment, cancer dependency maps, and miRNA-circRNA interactions, this study sought to uncover novel insights into tumor progression and therapeutic resistance. The findings have the potential to enhance diagnostic precision, stratify patients based on molecular characteristics, and inform the development of personalized therapeutic strategies, ultimately improving clinical outcomes for LumB breast cancer patients.
Materials and Methods
Gene Expression Analysis
Gene Expression Omnibus (GEO), a database for gene expression profiling and RNA methylation profiling maintained by the National Center for Biotechnology Information (NCBI), adheres to community-driven reporting standards and ensures the inclusion of key study components such as raw data, processed data and descriptive metadata (20). Data for whole-transcriptome expression analysis of luminal B breast cancer (LBBC) were obtained from the NCBI GEO database (www.ncbi.nlm.nih.gov/geo). The gene expression profile dataset, with accession number GSE45827, was retrieved from GEO (platform: GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array), comprising 30 LBBC and 11 normal samples. The data were analyzed in RStudio using the GEOquery and limma packages. A p-value < 0.05 was considered statistically significant.
Protein-Protein Interaction (PPI) Network Analysis
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins; www.string-db.org) integrates known and predicted associations among proteins, including both physical interactions and functional associations, by collecting and evaluating evidence from several sources (19). It was used to predict functional interactions between DEGs, with a high confidence score of 0.700 set as the cut-off criterion for constructing the PPI network. Furthermore, the cytoHubba plugin of the Cytoscape software (www.cytoscape.org) was used to identify important genes.
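As a rough illustration of the network step, the sketch below drops interactions under the 0.700 confidence cut-off and ranks proteins by how many remaining partners they have. Ranking by degree is an assumption made here for simplicity; the text does not say which cytoHubba scoring method was used. The protein labels, the scores, and the `hub_ranking` helper are hypothetical, and scores are assumed to be on a 0 to 1 scale with each interaction listed once.

```python
from collections import Counter

def hub_ranking(edges, min_score=0.7, top_n=3):
    """Rank nodes by degree after dropping low-confidence edges."""
    degree = Counter()
    for a, b, score in edges:
        # Keep only high-confidence interactions and ignore self-loops.
        if score >= min_score and a != b:
            degree[a] += 1
            degree[b] += 1
    # Sort by degree (descending), then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical interactions: (protein, protein, confidence score).
edges = [
    ("P1", "P2", 0.95), ("P1", "P3", 0.80), ("P1", "P4", 0.72),
    ("P2", "P3", 0.91), ("P3", "P4", 0.40), ("P4", "P5", 0.65),
]
top = hub_ranking(edges)
```

Here the two interactions scoring 0.40 and 0.65 are discarded before degrees are counted, so P5 never enters the network.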
Enrichment Analysis
WebGestalt 2024 marks a significant upgrade of the functional enrichment analysis platform (www.webgestalt.org). This update not only brings the database up to date but also enhances the tool's capabilities by incorporating support for metabolomics and introducing new pathways, networks and gene signatures (21). A p-value ≤ 0.05 was used as the significance threshold for gene ontology (GO) terms and pathways.
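To make the enrichment threshold concrete, the following sketch computes an over-representation p-value with the hypergeometric test, a common basis for this kind of analysis. It is a generic illustration under stated assumptions, not WebGestalt's implementation: the text does not specify the enrichment method or any multiple-testing correction, and the `ora_p_value` helper and the counts are hypothetical.

```python
from math import comb

def ora_p_value(universe, in_term, selected, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe, in_term, selected).

    universe: number of background genes
    in_term:  background genes annotated to the term or pathway
    selected: number of genes in the query list (for example, DEGs)
    overlap:  query genes that are annotated to the term
    """
    upper = min(in_term, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(in_term, k) * comb(universe - in_term, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 in the term, 4 queried, 3 overlapping.
p = ora_p_value(universe=20, in_term=5, selected=4, overlap=3)
is_enriched = p <= 0.05
```

With these made-up counts the p-value is 155/4845, which falls under the 0.05 threshold.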
Cancer Dependency Map (DepMap) Analysis
The DepMap database (www.depmap.org), building off of the original Cancer Cell Line Encyclopedia (CCLE) project, creates data and tools that can be used and shared by researchers (18). DepMap aids researchers in identifying genes that are crucial for the survival of cancer cells (genes with negative effects) and can be proposed as potential therapeutic targets. For example, genes with an effect score of less than -0.5 may represent promising candidates for the development of new therapeutic agents.
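The gene effect cut-off can be applied as in the minimal sketch below, which counts the share of cell lines in which a gene scores under -0.5. Requiring at least half of the cell lines to be dependent is an assumption added for illustration and is not taken from the study; the gene and cell line labels, the scores, and the `dependent_fraction` helper are hypothetical.

```python
def dependent_fraction(gene_effects, cutoff=-0.5):
    """Fraction of cell lines whose gene effect score is below the cutoff."""
    scores = list(gene_effects.values())
    if not scores:
        return 0.0
    return sum(score < cutoff for score in scores) / len(scores)

# Hypothetical gene effect scores per cell line (labels and values are made up).
effects = {
    "gene_x": {"line_1": -1.2, "line_2": -0.7, "line_3": -0.1, "line_4": -0.6},
    "gene_y": {"line_1": 0.1, "line_2": -0.2, "line_3": 0.0, "line_4": -0.4},
}
fractions = {gene: dependent_fraction(by_line) for gene, by_line in effects.items()}

# Assumed rule for this sketch: keep genes that are dependencies in >= 50% of lines.
candidates = [gene for gene, frac in fractions.items() if frac >= 0.5]
```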
Cancer Data Analysis
The University of ALabama at Birmingham CANcer data analysis Portal (UALCAN) (www.ualcan.path.uab.edu) is an interactive web resource for analyzing cancer omics data. The UALCAN web portal provides access to publicly available cancer transcriptome data, primarily from The Cancer Genome Atlas (TCGA), allowing researchers to explore gene expression and its association with patient survival, tumor stage and other clinical features across multiple cancer types. The platform offers customizable plots and statistical analyses, facilitating the identification of potential biomarkers or therapeutic targets (22).
Stemformatics Analysis
Stemformatics (www.stemformatics.org) is an online platform designed to provide high-quality visualization and analysis of stem cell-related gene expression data. It hosts a curated collection of publicly available datasets from various platforms, including microarray and RNA-seq, with a focus on stem cell biology and differentiation processes. Stemformatics is particularly useful for bioinformatics-driven insights into stem cell biology, facilitating hypothesis generation and validation in regenerative medicine and developmental biology (23).
Detection of Transcription Factors (TFs) and Kinases
Transcription Factors (TFs) potentially regulating LBBC-related genes were identified using the ChIP Enrichment Analysis (ChEA) database, which provides data on eukaryotic TFs, binding motifs, experimentally validated binding regions and target genes (24). Moreover, eXpression2Kinases (X2K) was used to rank putative TFs, protein complexes and kinases likely driving the observed transcriptomic changes in LBBC.
Small RNA Analysis
miRNet (www.mirnet.ca) is a comprehensive web-based platform designed to facilitate the integrative analysis of microRNA (miRNA)-centric regulatory networks. The database aggregates a wide range of experimentally validated and predicted miRNA-target interactions, including miRNA-gene, miRNA-protein, miRNA-disease, etc. Top miRNAs targeting LBBC-related genes were selected and ranked based on P-value (P ≤ 0.05) (25).
Graphical Abstract of the Integrated Multi-Omics Workflow for Luminal-B Breast Cancer Analysis.
Results
Identification of Differentially Expressed Genes (DEGs)
Differentially expressed genes were screened between the defined groups (30 LBBC and 11 normal samples) using the limma R package. A p-value < 0.05 and |LogFC| > 2.0 were considered statistically significant. A volcano plot is a scatter plot used to quickly spot changes in large data sets consisting of replicate data, with fold-change on the x axis and significance, expressed as -log10 of the p-value, on the y axis. Points of interest display both large-magnitude fold-changes and high statistical significance. Points with a fold-change below the cut-off (|log2 fold-change| < 2) are shown in gray (Figure 1a).
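The selection rule and the volcano plot coordinates described above can be sketched as follows. The thresholds come from the text (p-value < 0.05 and |LogFC| > 2.0); the gene labels, the values, and the helper names `classify_gene` and `volcano_point` are hypothetical, and this is an illustration of the rule rather than the limma workflow itself.

```python
import math

def classify_gene(log_fc, p_value, lfc_cutoff=2.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down', or 'ns' (not significant)."""
    if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
        return "up" if log_fc > 0 else "down"
    return "ns"

def volcano_point(log_fc, p_value):
    """Return (x, y) volcano coordinates: x = log2 fold-change, y = -log10(p)."""
    return log_fc, -math.log10(p_value)

# Hypothetical rows: (gene label, log2 fold-change, p-value); values are made up.
rows = [
    ("gene_a", 3.1, 0.001),
    ("gene_b", -2.6, 0.01),
    ("gene_c", 0.8, 0.2),
    ("gene_d", 2.5, 0.3),
]
labels = {gene: classify_gene(lfc, p) for gene, lfc, p in rows}
```

Note that gene_d has a large fold-change but fails the p-value cut-off, so it is labeled not significant; both conditions must hold.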
The interactions of up- and downregulated genes were investigated using the STRING database. Hub genes were identified with the cytoHubba plugin of Cytoscape v3.10.2 (Figure 1b). The hub gene list was compiled by analyzing the gene expression matrix and drawing a co-expression correlation coefficient heatmap. The co-expression analysis revealed the relationships among LBBC-associated genes.
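A co-expression heatmap of this kind is built from a matrix of pairwise correlation coefficients between gene expression profiles. The sketch below assumes the Pearson coefficient, which the text does not specify; the expression values and the helper names `pearson` and `coexpression_matrix` are hypothetical. It also assumes no gene has constant expression across samples, since that would make the denominator zero.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_matrix(expr):
    """Nested dict of pairwise correlations, keyed by gene label."""
    genes = sorted(expr)
    return {g: {h: pearson(expr[g], expr[h]) for h in genes} for g in genes}

# Hypothetical expression values across four samples (made up for illustration).
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}
matrix = coexpression_matrix(expr)
```

In this toy input gene_a and gene_b rise together while gene_c falls, giving coefficients near 1 and -1 respectively.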
Pan-Cancer Analysis
The methylation analysis of hub genes was conducted using the web-based UALCAN platform, followed by the construction of their networks (Figure 2a, b). Boxplots indicate an inverse relation between promoter methylation status and the gene expression profile of LBBC in TCGA invasive breast cancer. Based on expression in the data matrix and promoter methylation levels, six upregulated genes, namely fibroblast growth factor 1 (FGF1), MET proto-oncogene, receptor tyrosine kinase (MET), insulin like growth factor 1 (IGF1), peroxisome proliferator activated receptor gamma (PPARG), lipase E, hormone sensitive type (LIPE) and epidermal growth factor receptor (EGFR), and six downregulated genes, namely cyclin dependent kinase 1 (CDK1), kinesin family member 11 (KIF11), hyaluronan mediated motility receptor (HMMR), PDZ binding kinase (PBK), cyclin A2 (CCNA2) and NUF2 component of NDC80 kinetochore complex (NUF2), were found to be hub genes (Figure 2).
Co-expression heatmap analysis (a), Sample clustering based on t-SNE (b), Volcano plot of expression changes (c), Network of downregulated genes (d), Network of upregulated genes (e).
Methylation analysis to find hub genes (a), Up and down hub gene networks (b and c)
Analysis of Cancer-Related Gene Dependencies Using DepMap Data
This study analyzed the dependency of various cancer-related genes across multiple cell lines using data from the DepMap database. CRISPR-Cas9 and RNAi screening were employed to assess the essentiality of genes for cell survival. Each gene's dependency was quantified using gene effect scores, with negative scores indicating higher dependency. Key findings included genes like EGFR and MET, which were classified as highly selective and essential in a significant number of cancer cell lines, suggesting their potential as therapeutic targets. In contrast, genes like FGF1, FGF2 and EGF showed limited essentiality, indicating lesser importance across these cell lines. PPARG and LIPE showed a mix of CRISPR and RNAi data, with some selectivity but not as broadly essential as EGFR (Figure 3).
Stemformatics Analysis
Stemformatics analysis was conducted to investigate the role of specific genes in cancer stem cells. Our analysis revealed elevated expression of EGFR, IGF1, PPARG, EGF, LIPE, ADIPOQ, MET, FGF1, and FGF7 in fibroblasts. These genes are crucial in modulating key pathways that influence fibroblast activity, particularly in the tumor microenvironment. Their upregulation suggests a significant contribution to fibroblast-cancer cell interactions, promoting processes such as cell proliferation, migration and angiogenesis. These findings underscore the importance of fibroblast-related gene expression in cancer progression, highlighting potential therapeutic targets within the tumor stroma (Figure 4).
DepMap analysis of the dependency of tumor cell line panels in CRISPR (blue) and RNAi (violet) databases on the indicated hub genes.
Signaling Pathway Analysis
This study identified several key signaling pathways significantly enriched in our dataset through the analysis of gene expression data. Regulation of Lipolysis in Adipocytes was found to be the most prominent pathway, showing the highest degree of enrichment. Moreover, the PPAR signaling pathway was notably enriched, suggesting its potential role in the regulation of metabolic processes linked to cancer progression. Further pathways of interest included the Melanoma pathway and the ECM-Receptor Interaction. The Focal Adhesion pathway, which is crucial for cell migration and invasion, processes that are often dysregulated in cancer cells, was another significant finding. We also identified several oncogenic signaling cascades, such as PI3K-Akt, Ras, and Rap1 signaling pathways, all of which are frequently implicated in cancer development, cell proliferation and survival. Moreover, the AMPK signaling pathway, which plays a key role in cellular energy homeostasis, was highlighted as a significant pathway, potentially linking metabolic stress responses to cancer progression (Figure 5 a, b).
Stemformatics analysis demonstrating increased expression of hub genes in cancer stem cells
Stemformatics Analysis
Gene expression analysis using the Stemformatics database revealed distinct expression patterns across different cell types. Notably, the fibroblast cells exhibited the highest expression levels for several key genes. Specifically, IGF1, MET, EGF, EGFR, PPARG and FGF7 showed significantly elevated expression in fibroblasts compared with other cell types, indicating a prominent role of these genes in fibroblast functions. The findings suggest that fibroblasts have a unique transcriptomic profile, in which these genes are upregulated, potentially implicating their importance in cellular differentiation and tissue repair processes (Figure 6).
Signaling pathways analysis of up- and down-regulated genes (a, b)
Identification of kinases and transcription factors (TFs)
X2K was used to identify the key TFs, kinases, and intermediary proteins involved in the regulation of gene expression. Our results revealed that RELA, PPARG, EGR1, NFE2L2, and TP63 were the most significant TFs targeting the greatest number of genes associated with LBBC. Among 10 TFs, RELA and PPARG showed the most interactions with intermediate proteins and kinases (Figure 7).
miRNA Network Analysis
We identified microRNA (miRNA) networks for the genes FGF7, FGF2, IGF1, ADIPOQ, EGF, FGF1, LIPE, MET and PPARG using the miRNet database. In addition, miRNA networks were detected for FGF2, MET and PPARG, indicating potential regulatory interactions involving these miRNAs. These findings highlight the intricate post-transcriptional regulation of these genes through miRNAs (Figure 8).
The interaction of transcription factors (TFs; red spots) and kinases (blue spots) with hub genes.
Identification of the key miRNAs and genes involved in LumB.
Discussion
The present study identified the main molecular regulators of LumB breast cancer (LBBC) through transcriptomic analysis, opening a potential window for new therapeutic targets. Our analysis focused specifically on key entities, including hub genes, TFs, miRNAs, kinases, CAFs and PPIs.
The dataset (GSE45827) analyzed in our study contained 2,206 downregulated and 945 upregulated genes (from 11 normal and 30 tumor samples). The DEG analysis revealed 18 hub genes with significant prognostic potential, of which 10 were upregulated (FGF2, EGFR, ADIPOQ, LIPE, MET, IGF1, FGF1, EGF, FGF7 and PPARG) and 8 were downregulated (CDK1, KIF11, BUB1, HMMR, PBK, CCNA2, CDC20 and NUF2). Despite significant advances in understanding the molecular causes of LumB breast cancer, the discovery of such genes may help identify potential targets for the development of novel therapeutic agents.
There are studies showing altered methylation patterns of several hub genes in cancer, reflecting their key roles in tumor growth and survival. FGF1 and FGF2 are involved in cell proliferation and angiogenesis (26), while EGFR regulates crucial signaling pathways such as MAPK and PI3K-Akt, making it a promising therapeutic target (27, 28). In addition, MET was demonstrated to play a role in metastasis and survival, with its methylation changes suggesting disrupted regulation (29).
The upregulation of metabolic genes, such as ADIPOQ, LIPE, IGF1 and PPARG, highlights the tumor’s metabolic reprogramming to thrive in nutrient-rich environments (30). Our results suggest that epigenetic methylation can serve as a dual regulator that not only facilitates the DNA self-assembly process but may also be used as a universal biomarker for cancer (31-34). Moreover, the strongly selective dependencies identified for MET, EGFR, and PPARG reinforce their established roles as oncogenes or key modulators in cancer biology (33).
Results from DepMap highlight the importance of integrating CRISPR and RNAi datasets for a comprehensive understanding of gene essentiality in cancer (35). Our results showed that the dependency patterns of FGF7, ADIPOQ and LIPE underscore emerging vulnerabilities that may be context-specific, offering potential new therapeutic avenues (36, 37). Additionally, the lack of dependency for genes such as FGF1, FGF2 and EGF suggests functional redundancy or limited cancer-specific roles, emphasizing the necessity of multi-gene pathway analysis.
Stemformatics analysis revealed elevated expression of MET, EGFR, IGF1, FGF1 and FGF7, highlighting their importance in CAFs, which are the principal population of stromal cells in LumB tumors (38-40). Our results demonstrated that PPARG and ADIPOQ, whose central roles were previously discovered in CAFs, also exhibit increased expression in fibroblasts. EGF showed its highest expression level in iPSCs. The identification of such genes may help develop novel therapeutic strategies.
According to this study, the Rap1 and Ras pathways are essential for controlling cell adhesion and migration, and they can affect these functions by sharing adhesion molecules or downstream kinases. Interactions between Rap1 and Ras may have a synergistic effect on cell invasion (41, 42). Simultaneous targeting of the Rap1 and Ras pathways may serve as an effective strategy to decrease cancer cell invasion and metastasis. Future experimental studies are needed to validate these pathways as actionable targets. Additionally, this study is based on in-silico analyses, and clinical validation is required to confirm the relevance of identified genes, pathways, and biomarkers in patient populations.
Importantly, studies focusing on transcription factors (TFs) and kinases provide critical insights into the regulation of downstream signaling networks such as Rap1 and Ras pathways. Key transcription factors, including TP53, RELA, and PPARG, orchestrate diverse cellular processes ranging from apoptosis and inflammation to metabolic regulation, while kinases such as CDK1, CDK2, and MAPK14 act as pivotal modulators that phosphorylate these TFs to control their activity. These phosphorylation events are essential for proper cell cycle progression, DNA damage response, and stress signaling, thereby maintaining cellular homeostasis. Dysregulation of these TF–kinase networks can contribute to oncogenic transformation, uncontrolled proliferation, and enhanced metastatic potential. In the context of LBBC, integrating the activity patterns of these TFs and kinases may help pinpoint molecular vulnerabilities that could be exploited for targeted therapeutic interventions (43).
Recent proteogenomic and multi-omics studies have shown how integrating different molecular layers can refine cancer subtypes and uncover therapeutic targets (44-46). In our LBBC analysis, key regulators such as MET, EGFR, PPARG, and kinases including CDK1, CDK2, and MAPK14 overlapped with pathways linked to genomic instability and WGD-associated dependencies, particularly Ras/Rap1 and MYC/E2F signaling (47). These findings suggest that LBBC may share vulnerabilities with proteogenomic-defined cancer subtypes. To translate these results, further validation across independent datasets, survival analyses, and drug–gene interaction screening will be important. Moreover, extending future studies beyond coding changes to include noncoding variants using WGS and population-scale resources (48) will provide a more comprehensive view of LBBC biology.
Lastly, the intricate regulatory network involving miRNAs emphasizes their roles in modulating key genes such as MET, FGF2, and PPARG. Our research showed that hsa-mir-221-3p and hsa-mir-29a-3p play crucial roles in tumorigenesis and angiogenesis (49, 50). Such miRNAs target critical components of signaling pathways involved in cancer, metabolism and cellular differentiation.
Conclusion
In conclusion, the findings of this study shed light on the critical factors associated with the development of LumB breast cancer, paving the way for improved therapeutic interventions. Notably, integrating transcriptomics data may lead to novel treatment approaches for patients with breast cancer, potentially enhancing early detection and timely management of aggressive tumors.
- 1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229-63. PMID: 38572751. doi: 10.3322/caac.21834
- 2. Cheang MC, Chia SK, Voduc D, Gao D, Leung S, Snider J, et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. J Natl Cancer Inst. 2009;101(10):736-50. PMID: 19436038. PMCID: PMC2684553. doi: 10.1093/jnci/djp082
- 3. Nancy YY, Iftimi A, Yau C, Tobin NP, van't Veer L, Hoadley KA, et al. Assessment of long-term distant recurrence-free survival associated with tamoxifen therapy in postmenopausal patients with luminal A or luminal B breast cancer. JAMA Oncol. 2019;5(9):1304-9. PMID: 31393518. PMCID: PMC6692699. doi: 10.1001/jamaoncol.2019.1856
- 4. Ades F, Zardavas D, Bozovic-Spasojevic I, Pugliano L, Fumagalli D, de Azambuja E, et al. Luminal B breast cancer: molecular characterization, clinical management, and future perspectives. J Clin Oncol. 2014;32(25):2794-803. PMID: 25049332. doi: 10.1200/JCO.2013.54.1870
- 5. Heo YJ, Hwa C, Lee GH, Park JM, An JY. Integrative Multi-Omics Approaches in Cancer Research: From Biological Networks to Clinical Subtypes. Mol Cells. 2021;44(7):433-43. PMID: 34238766. PMCID: PMC8334347. doi: 10.14348/molcells.2021.0042
- 6. Vitorino R. Transforming Clinical Research: The Power of High-Throughput Omics Integration. Proteomes. 2024;12(3):25. PMID: 39311198. PMCID: PMC11417901. doi: 10.3390/proteomes12030025
- 7. Passaro A, Al Bakir M, Hamilton EG, Diehn M, Andre F, Roy-Chowdhuri S, et al. Cancer biomarkers: Emerging trends and clinical implications for personalized treatment. Cell. 2024;187(7):1617-35. PMID: 38552610. PMCID: PMC7616034. doi: 10.1016/j.cell.2024.02.041
- 8. Massard C, Michiels S, Ferte C, Le Deley MC, Lacroix L, Hollebecque A, et al. High-Throughput Genomics and Clinical Outcome in Hard-to-Treat Advanced Cancers: Results of the MOSCATO 01 Trial. Cancer Discov. 2017;7(6):586-95. PMID: 28365644. doi: 10.1158/2159-8290.CD-16-1396
