Unraveling the Molecular Mechanisms of Benzo(a)pyrene (BaP)-Induced Ovarian-Related Disorders: Integrating Computational Predictions and Experimental Validation
Mengwei Ma, Tao Qi, Yuqiang Lin, Haiyan He, Haotian Lei, Rufei Gao, Fei Han, Taihang Liu, Hanting Xu, Xuemei Chen

TL;DR
This study explores how the pollutant Benzo(a)pyrene affects ovarian health by combining computational models and experiments, identifying key genes involved in reproductive disorders.
Contribution
The novel integration of network toxicology, molecular simulations, and experimental validation reveals shared mechanisms of BaP-induced ovarian disorders.
Findings
BaP shows strong binding affinity with EGFR, ESR1, and STAT3, identified as critical targets for ovarian dysfunction.
BaP downregulates EGFR and ESR1 but upregulates STAT3 in KGN cells, confirming computational predictions.
Shared toxicological pathways suggest common mechanisms for BaP-induced LPD and POF.
Abstract
The ovaries are crucial reproductive organs that regulate the menstrual cycle and support pregnancy through the production of steroid hormones. They are highly susceptible to various environmental pollutants, which can lead to ovarian disorders. Luteal phase defect (LPD) and premature ovarian failure (POF) are common ovarian disorders in women. In this study, we integrate network toxicology with molecular docking and molecular dynamics simulations to elucidate the toxicological mechanisms of Benzo(a)pyrene (BaP), a widespread endocrine disruptor, in LPD and POF. Through systematic data mining of the GeneCards and OMIM databases, we identified 1336 targets associated with LPD and 2066 targets related to POF, as well as 220 BaP targets. Venn diagram analysis revealed 36 potential targets for BaP-induced LPD and 43 for BaP-induced POF. GO and KEGG enrichment analyses suggest that…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8- —National Natural Science Foundation of China
- —Natural Science Foundation of Chongqing
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsEffects and risks of endocrine disrupting chemicals · Cytokine Signaling Pathways and Interactions · Reproductive System and Pregnancy
1. Introduction
The ovaries, as the primary reproductive organs in females, play a crucial role in synthesizing and secreting steroid hormones such as estrogen (E2) and progesterone (P4) [1]. These hormones regulate the menstrual cycle, support ovarian follicle development, and maintain reproductive health. Proper secretion of these hormones is essential for ovulation, successful pregnancy progression, and overall female reproductive health [2]. Disorders affecting ovarian function contribute significantly to the global burden of gynecological diseases. In 2021, 1.53 billion cases of non-neoplastic gynecological diseases were reported worldwide—a 55.3% increase since 1990—and women aged 20–49 years bore 68% of the associated disability-adjusted life years [3]. Among the most prevalent ovarian disorders are luteal phase defect (LPD) and premature ovarian failure (POF), both characterized by insufficient secretion of steroid hormones, leading to menstrual irregularities, infertility, and recurrent miscarriage [4,5,6]. Clinically, ovarian insufficiency accounts for approximately 35% of early pregnancy losses and 4% of recurrent pregnancy losses [7], further emphasizing the critical importance of ovarian function during pregnancy. Studies have shown that prenatal exposure to environmental pollutants, such as ethephon and benzo(a)pyrene (BaP), can induce LPD-like phenotypes in animal models. These phenotypes are characterized by decreased serum P4 levels, impaired endometrial receptivity and decidualization, lower embryo implantation rates, and abnormal development of the placenta and embryo [8,9,10]. Moreover, Tran DN et al. reported that combined exposure to phthalates and 4-vinylcyclohexene diepoxide accelerates follicular depletion, ultimately leading to POF in rats [11]. Among the various contributing factors to LPD and POF, increasing attention has been directed toward environmental causes, particularly endocrine-disrupting pollutants.
Polycyclic aromatic hydrocarbons (PAHs) constitute a class of air pollutants that are of global concern due to their toxicity, persistence, and propensity for environmental dispersion. These compounds can enter the environment via multiple pathways, including exhaust emissions [12], wastewater discharge, and sediment contamination [13]. PAHs comprise various hazardous chemicals, among which BaP is particularly notable as it is recognized as one of the most potent carcinogens within this group [14]. Research has demonstrated that BaP possesses carcinogenic, teratogenic (causing birth defects), mutagenic (inducing genetic mutations), and reproductive toxicity properties [15]. Human exposure to BaP occurs predominantly through smoking, ingestion of contaminated food, and dermal absorption, with dietary intake constituting the principal exposure pathway in non-smoking individuals [16].
Epidemiological evidence strongly links BaP exposure to adverse pregnancy outcomes. A nested case–control study within a Chinese birth cohort reported that maternal BaP-DNA adduct levels were associated with increased preterm birth risk (OR = 1.27 per interquartile range increase, 95% CI: 0.95–1.67), with high-level exposure conferring more than twofold risk compared to low-level exposure (OR = 2.05, 95% CI: 1.05–4.01) [17]. Additional studies have linked prenatal PAH exposure to reduced fetal growth, low birth weight, and altered placental function [18,19]. Our previous research has confirmed that BaP exposure exerts toxic effects on ovarian function, leading to adverse pregnancy outcomes [20]. However, the literature addressing BaP’s role in ovarian-related diseases such as LPD and POF remains limited, and the underlying mechanisms require elucidation. Previous studies have indicated associations between BaP exposure and these disorders [21]. Given escalating environmental pollution, investigating the relationship between BaP and ovarian diseases is imperative.
Environmental pollutants often induce intricate toxicological responses by perturbing diverse biomolecules. Traditional toxicological methods struggle to comprehensively assess interrelationships among various toxicities. Network toxicology addresses these challenges by integrating bioinformatics and big data, constructing associative networks among compounds, toxicity targets, and diseases, providing rapid and comprehensive toxicological insights [22]. By leveraging multiple databases, a “compound-target-disease” network model is constructed to investigate the toxicological characteristics of targets. This method bypasses traditional toxicological experiments, offering simplicity, accuracy, and broad applicability [23]. The objective of network toxicology is to decipher the molecular interactions within biological systems and to determine how these interactions collectively contribute to the manifestation of toxicity. Additionally, molecular docking enables atomic-level simulation of the binding interactions between chemicals such as BaP and protein targets, thereby revealing the potential molecular pathways through which these substances may promote disease mechanisms [24]. Molecular docking can predict and clarify toxin-biomolecule interactions, uncovering the underlying mechanisms of toxicity and their consequent adverse effects on biological organisms. Molecular dynamics simulation is employed to assess the stability and dynamic behavior of the toxin-protein complexes over time [25], providing deeper insights into their binding affinities and conformational changes. A recent study has delineated the toxicity mechanisms of the pesticide Thiabendazole and elucidated its molecular hepatotoxicity using this approach [23]. Additionally, Pei et al. have employed this method to clarify the impact of endocrine disruptors on kidney damage [26].
In this study, we integrated network toxicology, molecular docking, and molecular dynamics simulation techniques to explore the potential toxic targets and associated mechanisms of BaP-induced LPD and POF. Additionally, the identified core targets were validated in cell models. Our study aims to facilitate the effective application of network toxicology in identifying toxicological targets and elucidating the molecular mechanisms of action of environmental pollutants, while establishing experimental evidence for future clinical prevention and therapeutic strategies.
2. Result
2.1. Chemical Properties and Toxicological Assessment of BaP
The chemical characteristics of BaP, including its formulas, molecular weight, and SMILES structure, were provided in Supplementary File S1. In total, 220 BaP targets were identified using the Super-PRED, ChEMBL, and SEA websites. The PPI protein interaction networks of BaP target genes were constructed using the String database and visualized with Cytoscape software 3.10 (Supplementary Figure S2). The PPI results indicated that the core targets of BaP primarily included EGFR, ESR1, STAT3, MMP9, ERBB2, and NFKB1.
2.2. Potential Targets of BaP Induce LPD and POF
Based on literature reviews and our previous research, we have identified that BaP exhibited female reproductive toxicity, particularly by impairing ovarian function. Therefore, we focused specifically on ovarian-related diseases, namely LPD and POF. First, a total of 1336 LPD-related targets and 2066 POF-related targets were retrieved from the GeneCards and OMIM databases. Next, through Venn diagram analysis, we obtained 36 targets for BaP-induced LPD and 43 targets for BaP-induced POF, respectively (Figure 1A,B). These targets were then imported into Cytoscape to construct compound regulatory networks. Figure 1C illustrates the network of targets associated with BaP-induced LPD, while Figure 1D presents the network of targets related to BaP-induced POF.
2.3. GO and KEGG Pathway Analysis for the Targets Associated with BaP-Induced LPD and POF
We subsequently conducted Gene Ontology (GO) enrichment analysis for the target genes associated with BaP-induced LPD and POF. The results revealed that the LPD-related target genes were predominantly involved in biological processes such as responses to xenobiotic stimuli, rhythmic processes, and intracellular calcium ion homeostasis (Figure 2A). In contrast, the POF-related target genes were enriched in pathways related to xenobiotic stimuli, inflammatory regulation, and mechanical stimuli (Figure 2B). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses demonstrated that both BaP-induced LPD and POF target genes were significantly enriched in pathways including chemical carcinogenesis–receptor activation, prolactin signaling pathway, chemokine signaling pathway, and Th17 cell differentiation (Figure 2C,D). Comparative analysis highlighted overlapping biological functions and signaling pathways between BaP-induced LPD and POF, suggesting that these shared mechanisms play a critical role in both conditions.
2.4. Construction of the PPI Network and Identification of Hub Targets
Figure 3A illustrates the protein interaction network for BaP-induced LPD, while Figure 3B depicts the network for BaP-induced POF. For BaP-induced LPD, the mean degree value was 8; proteins with a degree ≥ 16 were designated as hub targets. For BaP-induced POF, the mean degree value was 10; proteins with a degree ≥ 20 were selected as hub targets. This process yielded four hub targets (EGFR, ESR1, STAT3, and MMP9) for BaP-induced LPD, and four hub targets (EGFR, ESR1, STAT3, and NFKB1) for BaP-induced POF. As illustrated in Figure 3C,D, the central small red circles represented the hub target, while the larger outer circles denoted the remaining targets. The color intensity reflected the degree value, with deeper red indicating higher connectivity. Notably, the PPI network visualization revealed a partial overlap between the targets of BaP-induced LPD and POF. Among these signature genes, EGFR, ESR1, and STAT3 emerged as core targets, suggesting their potential involvement in the pathogenesis of multiple diseases. To further investigate this overlap, we utilized a Venn diagram to identify the common targets associated with BaP-induced LPD and POF, resulting in a total of 26 targets, as shown in Figure 4A. Further selection based on degree values highlighted EGFR, ESR1, and STAT3 as key common targets for BaP-induced LPD and POF (Figure 4B).
2.5. Enrichment Analysis of Common Targets Associated with BaP-Induced LPD and POF
Enrichment analysis was performed to explore the potential biological functions and underlying mechanisms of the core targets. The results indicated that these core targets were predominantly associated with miRNA transcription and metabolic process, response to estradiol, and cell proliferation (Figure 5A). Additionally, we identified several key pathways, including proteoglycans in cancer, chemical carcinogenesis–receptor activation, prolactin signaling pathway, FoxO signaling pathway, and estrogen signaling pathway (Figure 5B). These findings imply that receptor activation, estradiol response, and cell proliferation may play crucial roles in the induction of LPD and POF by BaP.
2.6. Molecular Docking Validation and Binding Mode Analysis of BaP with Core Targets
Prior to docking BaP with the core targets, the docking protocol was validated by redocking the co-crystallized ligands into their respective binding sites. As shown in Supplementary Table S3, the RMSD values between the redocked poses and the original experimental conformations were below the acceptable threshold of 3.0 Å for all three targets, confirming that the adopted grid parameters and docking settings reliably reproduce native binding modes. To elucidate the potential binding modes of BaP with the core targets, blind molecular docking was performed using AutoDockTools. The detailed grid parameters used for each protein are provided in Supplementary Table S4. As shown in Figure 6 and Supplementary Table S5, BaP interacted with EGFR primarily through hydrophobic interactions with residues ALA-719, LEU-694, LEU-768, MET-769, THR-766, and LEU-820, yielding a binding energy of −7.33 kcal·mol^−1^. Similarly, BaP engaged in hydrophobic interactions with ESR1 at residues LEU-387, LEU-391, ILE-424, LEU-346, LEU-349, and LEU-525, along with π-stacking interactions with PHE-404, resulting in a binding energy of −9.12 kcal·mol^−1^. Additionally, BaP interacted with STAT3 via hydrophobic interactions at LEU-532, LYS-548, and ALA-555 of STAT3, as well as π-stacking with LYS-548, yielding a binding energy of −6.44 kcal·mol^−1^. All binding energies were below −5 kcal·mol^−1^, indicating stable binding between BaP and these targets.
2.7. Molecular Dynamics Simulations of BaP-Target Complexes
To further assess the stability of these interactions, molecular dynamics (MD) simulations were performed. Root-mean-square deviation (RMSD) is a valuable metric for evaluating protein–ligand complex stability, quantifying atomic positional changes relative to the initial positions. In molecular dynamics simulations, a system is considered to have reached equilibrium when RMSD values stabilize and fluctuate around a constant value after an initial increase [27]. For protein–ligand complexes, RMSD values < 5 Å are generally indicative of high-quality models with stable conformations, while values between 5 and 10 Å suggest moderate stability with some conformational flexibility; values exceeding 12 Å typically reflect unstable or poorly resolved interactions [28]. In docking validation, an RMSD ≤ 2.0 Å between predicted and experimental binding poses is conventionally regarded as a successful docking conformation, achieving success rates of approximately 91% [29]. However, for assessing overall complex stability throughout MD trajectories, slightly higher RMSD values may still represent stable interactions depending on protein flexibility and simulation conditions. As depicted in Figure 7A, the ESR1-BaP and STAT3-BaP complex systems reached equilibrium after approximately 10 ns, with RMSD values stabilizing around 2.9 Å and 2 Å, respectively. These values fall well within the high-quality model range [28] and are consistent with reported RMSD values (2.16–2.82 Å) for stable protein–ligand complexes [30]. The EGFR-BaP complex system achieved equilibrium after approximately 80 ns, exhibiting RMSD fluctuations around 10 Å, which falls within the medium-quality model range (5–10 Å) [28]. While this value is higher than those observed for ESR1-BaP and STAT3-BaP, it remains within the acceptable range for moderately stable complexes and likely reflects the intrinsic conformational flexibility of the EGFR kinase domain, which is known to undergo structural rearrangements upon ligand binding. Further analysis revealed that the radius of gyration (Rg) and solvent-accessible surface area (SASA) of all three complexes underwent minimal changes throughout the simulation (Figure 7B,C), suggesting that BaP binding did not induce major perturbations in protein folding or compactness. Root-mean-square fluctuation (RMSF) analysis (Figure 7D) showed that the ESR1-BaP, EGFR-BaP, and STAT3-BaP systems exhibited relatively low RMSF values (predominantly below 4 Å), indicating reduced residue mobility and enhanced local stability upon complex formation. Collectively, these results indicate that the ESR1-BaP, EGFR-BaP, and STAT3-BaP systems establish stable interactions. Therefore, BaP exhibits strong binding affinity toward the target proteins ESR1, STAT3, and EGFR. These findings advise that BaP may cause LPD and POF by interacting with these targets. However, further experimental validation is required.
2.8. Validation of Core Target Interactions with BaP in a Cellular Model
Our previous studies have demonstrated that exposure to BaP leads to a reduction in ovarian hormone levels, primarily E2 and P4, which mirrors the symptoms observed in diseases such as LPD and POF. To investigate whether core proteins EGFR, ESR1, and STAT3 play critical roles in BaP-induced LPD and POF pathogenesis, we established an in vitro model of BaP exposure for validation. The cell model exhibited steroid hormone alterations that were consistent with the changes observed in the animal model [10,31,32]. Treatment with BaP significantly suppressed the expression of 3β-HSD, 17β-HSD, and P450scc in luteinized KGN cells (Supplementary Figure S2), thereby reducing the synthesis and secretion of E2 and P4 (Figure 8A), confirming the successful establishment of the cell model. Furthermore, the results revealed that BaP treatment decreased the protein expression levels of the core targets EGFR and ESR1 while simultaneously increasing the protein levels of STAT3 (Figure 8B,C). These findings suggest that BaP interacts with core targets, modulating their function, and provide evidence that EGFR, ESR1, and STAT3 may be involved in the pathogenesis of LPD and POF.
3. Discussion
LPD and POF have multifactorial causes, among which environmental factors represent a significant contributor. BaP, a PAH, is recognized as a significant environmental pollutant that exerts various toxic effects on living organisms [33]. Although its toxic effects have been extensively studied, the specific ovarian toxicity and underlying mechanisms remain poorly understood. This research presents an in-depth examination of the potential adverse effects of BaP through the lens of multiple diseases, leveraging data from toxicology databases and literature reviews. The findings indicate that BaP induces LPD and POF through multiple potential toxic targets. By integrating network toxicology, molecular docking, and MD simulation techniques, we systematically identify the interactions between BaP and key molecular targets, thereby elucidating the molecular mechanisms by which BaP impairs reproductive system function. These results not only enhance our understanding of the toxicological mechanisms of BaP but also highlight its potential risks for adverse pregnancy outcomes, ovulation, and overall female reproductive health.
In the study of environmental pollutant toxicity, traditional toxicological approaches predominantly depend on experimental methods such as animal testing and cell culture. Although these methods are indispensable in toxicological research, they also exhibit notable limitations. Toxicological responses are often intricate, involving multiple biological targets and signaling pathways. However, traditional methods tend to concentrate on responses related to single targets or specific doses, which limits the comprehensive understanding of compound interactions within biological systems. This narrow focus hinders the discovery of potential toxic mechanisms and long-term effects. Moreover, the considerable time, costs, and ethical concerns associated with experimental procedures present challenges to traditional toxicological research [24].
In contrast, emerging approaches such as network toxicology integrate bioinformatics and big data analysis to encompass chemical properties, biological targets, toxic mechanisms, and disease associations, thereby constructing a more comprehensive and systematic “compound–target–disease” network [23]. Moreover, network toxicology enhances toxicity assessment by offering more precise predictive capabilities through advanced big data analysis and bioinformatics tools [34]. This not only expedites research but also reduces reliance on animal testing, aligning with contemporary ethical standards. In this research, potential targets for BaP, LPD, and POF were identified through comprehensive database queries. We utilized Venn analysis to pinpoint the key targets involved in BaP-induced LPD and POF. Further investigation of these core targets revealed shared common regulatory mechanisms and target genes between BaP-induced LPD and POF. Additionally, our enrichment analysis of core genes indicated that BaP can trigger various diseases, including pancreatic cancer, non-small cell lung cancer, and Hepatitis C, via common target genes. Network toxicology analysis not only identified multiple key targets but also facilitated the simultaneous targeting of different diseases through shared molecular mechanisms. This approach goes beyond the traditional toxicological concentration on individual targets by incorporating multiple targets and their interactions, thereby enhancing our understanding of the diverse biological effects induced by toxic substances.
A comprehensive review of the literature indicates that exposure to BaP is associated with various adverse health effects, and its carcinogenicity has been confirmed by multiple epidemiological studies [35,36,37]. Research demonstrates a strong link between BaP exposure and the development of various tumors, including lung and breast cancer [35,37]. Our study identified several biological functions and pathways associated with cancer progression. Additionally, growing evidences suggest that BaP may adversely impact reproductive health. For instance, study has shown that BaP exposure can lead to ovarian dysfunction and reduced fertility in female animals, highlighting potential reproductive toxicity [9]. Specifically, long-term exposure to BaP has been shown to impair ovarian steroidogenesis and disrupt hormonal regulation, thereby negatively impacting female reproductive health. These findings underscore the urgent need for stringent regulations to assess environmental pollutants and their impact on reproductive health. LPD and POF are prevalent ovarian disorders that impose significant physical and psychological burdens on women of childbearing age due to the demanding treatment processes and associated costs [6,38]. Current research predominantly focuses on the etiology of individual diseases, often overlooking the influence of environmental pollutants on disease onset and progression. Additionally, there is a lack of comprehensive analysis regarding the mechanistic pathways by which environmental pollutants impact a spectrum of related diseases. Our study employs network toxicology methods to investigate potential toxicological targets and mechanisms associated with BaP-induced LPD and POF, thereby advancing this field of research. We have identified several key targets, including EGFR, ESR1, and STAT3, which exhibit high centrality within the target network and may play crucial roles in BaP-induced ovarian toxicity. Furthermore, we elucidate the molecular pathways through which BaP exerts its toxic effects on ovarian function. These findings not only contribute to the existing literature but also emphasize the necessity for a deeper understanding of the environmental and health impacts of BaP exposure.
EGFR is a transmembrane glycoprotein that plays a critical role in various cellular processes, including proliferation, survival, differentiation, migration, and inflammation [39]. Increasing evidence links EGFR to multiple types of cancers, including ovarian cancer, glioblastoma, and lung cancer. Consequently, targeting EGFR represents a promising therapeutic strategy for cancer treatment [40]. EGFR has emerged as a viable target for diverse clinical applications. A recent study has shown that high expression of microRNA-194 in the granulosa cells of PCOS patients can induce KGN cell apoptosis by directly targeting heparin-binding EGF-like growth factor [41]. Another study demonstrated that toad venom inhibits the proliferation of ovarian cancer cells via the EGFR/AKT/ERK signaling pathway [42]. Moreover, EGFR ligands influence luteal function. Specifically, amphiregulin (AREG) regulates steroid hormone production through the EGFR/STAR pathway, thereby controlling luteal activity [43].
Estradiol and estrogen receptor alpha (ERα, also known as Esr1) are essential for maintaining normal ovarian function and development in mammals, and animals lacking Esr1 exhibit infertility [44,45,46]. A recent study indicates that ovaries with Esr1 deficiency display signs similar to ovarian aging, characterized by significant iron accumulation [47]. Additionally, differentially expressed genes (DEGs) in stromal cells between young and aged ovaries are regulated by ESR1 and AR. This regulation may explain the expression changes in stromal cells related to ovarian aging, influenced by variations in steroid hormones and/or hormone receptors [48]. STAT3 is a member of the STAT family of transcription factors that mediates cellular responses to cytokines and growth factors [49]. Persistent activation of STAT3 can enhance cell proliferation, survival, invasion, cancer stem cell-like characteristics, angiogenesis, and drug resistance in ovarian cancer [50]. Furthermore, the JAK/STAT3 signaling pathway, which is activated downstream of IL-6, is crucial for the proliferation of ovarian stem cells [51]. The JAK-STAT signaling mediates various downstream events, including hematopoiesis, immune response, inflammation, and lipogenesis [52]. Additionally, combined inhibition of the EGFR and JAK/STAT3 pathways has been shown to exhibit a synergistic effect against human ovarian cancer [49]. However, the relationships between EGFR, STAT3, and ESR1 and the development of LPD and POF remain incompletely understood. Based on previous studies and our findings, we hypothesize that BaP may disrupt ovarian hormone synthesis and promote ovarian cell aging through the dysregulation of these key signaling molecules. Notably, our data suggest that BaP downregulates EGFR and ESR1 while upregulating STAT3, which may collectively contribute to impaired cell proliferation and hormone production. Therefore, BaP-induced inhibition of ovarian cell proliferation may be mechanistically linked to its effects on EGFR, ESR1, and STAT3 expression. Further studies are needed to elucidate their precise roles in the pathogenesis of BaP-induced LPD and POF.
4. Materials and Methods
4.1. Acquisition of BaP Targets
First, “BaP” was queried in the PubChem database to identify the most accurate match and to verify the correctness of its chemical name and molecular formula. Subsequently, the 3D structure (SDF format) and the SMILES representation of BaP were downloaded, and its CAS registry number was confirmed as 50-32-8. The corresponding SDF file was subsequently uploaded to the Super-PRED platform (developed by Charité—Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; available at https://prediction.charite.de/subpages/target_prediction.php; accessed on 8 June 2025) for target prediction, and the resulting data were then downloaded for further analysis. Next, the SDF file of BaP was uploaded to both the ChEMBL website [developed by the European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK; available at https://www.ebi.ac.uk/chembl/; accessed on 8 June 2025] and the Similarity Ensemble Approach website [SEA, developed by the Shoichet Laboratory, University of California, San Francisco (UCSF), 1700 4th Street, San Francisco, CA 94158, USA; available at https://sea.bkslab.org/; accessed on 8 June 2025]. For this study, Homo sapiens was selected as the target species, and the prediction results were submitted and downloaded from both platforms. Finally, all target UniProt IDs were converted to common names using the UniProt website. The retrieved results represented the putative targets of BaP. Prediction outputs from the three platforms were integrated, and redundant entries were eliminated to construct a refined BaP target library.
4.2. Target Construction of Diseases
Disease target libraries were primarily constructed using the GeneCards [developed by Weizmann Institute of Science, 234 Herzl Street, Rehovot, 7610001, Israel; version 5.26 (updated 10 November 2025); available at http://www.genecards.org/; accessed on 8 June 2025] and OMIM (eveloped by Johns Hopkins University, 733 N. Broadway, Baltimore, MD 21205, USA; available at https://www.omim.org/; accessed on 8 June 2025) databases. Briefly, the keywords “luteal phase dysfunction” and “premature ovarian failure” were employed to identify targets related to LPD and POF in these databases. The predicted targets from both databases were then combined, duplicates were removed, and the remaining targets were retained to establish the disease-related target library.
4.3. Target Overlap Analysis and Construction of the Compound Regulatory Network
The intersection of targets associated with LPD and POF was then identified in relation to BaP targets, highlighting them as potential targets for BaP-induced LPD and POF. Using R (version 4.3.1), we constructed a Venn diagram to identify the intersecting genes, generated the corresponding network relationship file, and a node attribute file. These files were subsequently imported into Cytoscape 3.10.1 to construct the Compound Regulatory Network. The scripts used for these analyses are provided in Supplementary File S1.
4.4. Functional Enrichment Analysis Based on GO and KEGG Pathways
GO enrichment analysis was conducted using the DAVID database. The analysis included three GO categories: biological processes (BP), molecular functions (MF), and cellular components (CC). At the same time, KEGG pathway analysis was also performed to identify relevant signaling pathways. The overlapping targets identified via Venn diagram analysis were uploaded to the DAVID platform, with Homo sapiens specified as the reference species. Results were then analyzed and submitted. Based on the ascending order of p-value, the top 10 GO catalog and the top 20 KEGG catalog were selected for further analysis. Visualization of GO and KEGG results was performed using R (version 4.3.1). The scripts used for these analyses are provided in Supplementary File S1.
4.5. Construction of the PPI Network
The PPI network was established using the STRING database (developed by CPR—Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen, Denmark; version 12.0; available at https://cn.string-db.org/; accessed on 15 June 2025). Overlapping targets obtained from the Venn diagram analysis were uploaded to the STRING platform for subsequent network construction, with Homo sapiens selected as the species of interest. After submission, the resulting PPI network diagram was subsequently generated and visualized. The TSV file containing information on core (hub) targets was downloaded for Excel analysis. Subsequently, the Network Analyzer plugin in Cytoscape was utilized to analyze the network structure and identify the key node. Nodes with degree values equal to or greater than twice the median were considered core targets, reflecting their central roles in the interaction network.
4.6. Docking Protocol Validation and Molecular Docking of BaP with Core Targets
The docking protocol was validated by redocking co-crystallized ligands into their respective binding sites. For each target, the native ligand—AQ4 (EGFR, 1M17), 29S (ESR1, 6PSJ), and KQV (STAT3, 6NJS)—was extracted and re-docked using identical grid parameters and settings as those employed for BaP. The grid box was centered on the original ligand with dimensions covering the entire binding site. RMSD values between redocked and experimental poses were all below 2.0 Å (Supplementary Table S3), confirming that the protocol accurately reproduces native binding modes. Following validation, molecular docking was performed to investigate BaP interactions with EGFR, ESR1, and STAT3—selected based on their high centrality in the PPI network and disease database annotations linking them to LPD and POF [42,47,51]. Their 3D structures were obtained from the PDB: EGFR (1M17, kinase domain with quinazoline inhibitor), ESR1 (6PSJ, ligand-binding pocket with agonist), and STAT3 (6NJS, SH2 domain with small-molecule inhibitor). For blind docking, the grid box was set to encompass the entire protein structure to allow unbiased binding site searching. Grid parameters (center coordinates, grid point numbers, spacing) are summarized in Supplementary Table S4. Protein structures were preprocessed with AutoDockTools 1.5.7 (water removal, polar hydrogen addition, Gasteiger charges). Docking simulations were performed using AutoDock Vina (version 1.2.7) with flexible BaP and rigid proteins. For each protein–ligand pair, 10 independent docking runs were conducted, and resulting conformations were clustered with 2.0 Å RMSD tolerance; the lowest-energy conformation from the most populated cluster was selected for analysis. Binding affinities and modes were evaluated. Calculations were performed on an Intel Core i7-12,700K workstation (12 cores, 32 GB RAM, Ubuntu 20.04 LTS) with AutoDock Vina configured to use 8 CPU threads and exhaustiveness = 32; average runtime was ~15 min per pair. Docking complexes were visualized in PyMOL (version 3.1.6.1), and interaction profiles (hydrogen bonds, hydrophobic contacts, π-stacking) were characterized using the Protein–Ligand Interaction Profiler [PLIP, developed by the Biotechnology Center (BIOTEC), TU Dresden, Tatzberg 47-49, 01307 Dresden, Germany; available at https://plip-tool.biotec.tu-dresden.de/plip-web/plip/index; accessed on 15 June 2025].
4.7. Molecular Dynamics Simulations
MD simulations of the complexes were performed for 100 nanoseconds using GROMACS 2022. The Charmm 36 force field was employed for the protein, while the Gaff2 force field was utilized for the ligand. The protein–ligand complex was solvated using the TIP3P water model in a periodic rectangular box, ensuring a minimum distance of 1.2 nm between any atom of the complex and the box edge. Electrostatic interactions were calculated using the Particle Mesh Ewald (PME) method, while the Verlet algorithm was employed for integrating the equations of motion. Prior to production runs, energy minimization was conducted for 100,000 steps, followed by equilibration in the NVT ensemble (isothermal-isochoric) and NPT ensemble (isothermal-isobaric) for 100 ps each, with a coupling constant of 0.1 ps. Both van der Waals and Coulomb interactions were calculated using a cutoff distance of 1.0 nm. Subsequently, the system was simulated under constant temperature (310 K) and constant pressure (1 bar) for a total duration of 100 ns. Finally, after system stabilization, the binding free energy was estimated using the MM/PBSA method.
4.8. Cell Culture and Protein Extraction
KGN cells (purchased from ATCC) were utilized for in vitro experiments. In our previous study [20], luteinization was induced in KGN cells by treating them with hCG (1.0 IU/L, Sigma, 3050 Spruce Street, St. Louis, MO 63103, USA), and the effects of BaP (5 µmol/L, purity 96%, Sigma) were investigated. The experiment design consisted of three groups: a control group (0.1% DMSO, MCE, 1 Deer Park Drive, Suite Q, Monmouth Junction, NJ 08852, USA), an hCG-treated group (1.0 IU/L hCG), and a co-treatment group (hCG + BaP). After 24 h of treatment, samples were collected. Cells were washed with Hank’s balanced salt solution (HBSS, Beijing Sunview Biotech Co., Ltd., No. 9 Kechuang 14th Street, BDA, Beijing, 101111, China), lysed, and centrifuged at 12,000 rpm for 15 min at 4 °C to obtain the supernatant. Protein concentration was quantified using the BCA assay. Samples were subsequently mixed with SDS-PAGE loading buffer (Beyotime Biotech Inc., 7F, #768, Jinshajiang Road, Putuo District, Shanghai, 200333, China), denatured at 100 °C for 10 min, cooled to room temperature, and stored at −80 °C for further analysis.
4.9. Western Blot
The collected samples were separated using the Easy PAGE Gel Fast Preparation Kit (Sevenblo, Beijing Sevenbio Biotech Co., Ltd., No. 8 Courtyard, Dongbei Wang West Road, Haidian District, Beijing, 100193, China) and subsequently transferred to polyvinylidene fluoride (PVDF, Milipore, 400 Summit Drive, Burlington, MA 01803, USA) membranes. The membranes were blocked with 5% BSA in a 37 °C shaker for 80 min. They were then incubated with the corresponding primary antibody overnight at 4 °C. Following this, the membranes were washed with PBS-T and incubated with appropriate secondary antibody (1:3000 dilution, ZSGB-BIO, No. 99 Life Park Road, Zhongguancun Life Science Park, Changping District, Beijing, 102206, China) at 37 °C for 1 h. Chemiluminescent detection was performed in a dark chamber using an ECL system. Gray scale analysis was conducted using Quantity One software (version 4.6.2), with β-actin as the internal reference. The primary antibody information is listed in Supplementary Table S6.
4.10. RNA Isolation and Quantitative PCR (qPCR)
Total RNA was extracted from KGN cells using Trizol reagent (Takara, Takara Bio Inc., 4-1 Nishiaku Shichijochuo, Kushiro, Hokkaido, 084-0922, Japan; distributed by Takara Biomedical Technology (Beijing) Co., Ltd., No. 29 Courtyard, Xinxidong Road, Daxing District, Beijing, 102600, China), and its concentration and purity were determined. The isolated RNA was reverse-transcribed into cDNA using an RT Master Mix for qPCR kit (MCE). qPCR analysis was conducted on a real-time PCR instrument (Bio-Rad, 1000 Alfred Nobel Drive, Hercules, CA 94547, USA) using 2× Universal SYBR Green Fast qPCR Mix (ABclonal, Building 4, Optics Valley Biomedical Industry Park, No. 388 Gaoxin 2nd Road, East Lake High-Tech Development Zone, Wuhan, Hubei, 430074, China) under the following conditions: predenaturation at 95 °C for 2 min, followed by 40 cycles of denaturation at 95 °C for 30 s and extension at 60 °C for 30 s. The mRNA expression levels were quantified using the Comparative CT method. Primers for qRT-PCR are listed in Supplementary Table S7.
4.11. Enzyme-Linked Immunosorbent Assay (ELISA)
Cells were treated with 0.25% trypsin for a brief period to obtain a single-cell suspension. Subsequently, the cells were centrifuged at 1000 rpm for 5 min and washed three times with sterile PBS. Cell lysis was achieved through a freeze–thaw cycle, involving freezing the cells at −20 °C or −80 °C followed by thawing at RT, repeated three times. Finally, the lysate was centrifuged at 2000 rpm for 20 min at 4 °C, and the supernatant was collected for subsequent analyses. The concentrations of target analytes in the preprocessed samples were determined following the instructions provided in the ELISA kit.
5. Conclusions
This study employed network toxicology, molecular docking, and MD simulation to investigate the ovarian toxicity mechanisms of BaP. We identified 36 potential targets for BaP-induced LPD and 43 roe POF, with EGFR, ESR1, and STAT3 emerging as core targets based on network centrality. Molecular docking confirmed stable binding affinities between BaP and these targets (ranging from −6.44 to −9.12 kcal·mol^−1^), while MD simulations demonstrated complex stability, with RMSD values indicating high-quality models for ESR1-BaP and STAT3-BaP complexes and moderate stability for EGFR-BaP. Cell-based validation further supported the reliability of these computational findings.
These results suggest that BaP may contribute to LPD and POF pathogenesis through dysregulation of EGFR, ESR1, and STAT3 signaling, potentially disrupting ovarian hormone synthesis and cell proliferation. Although computational models cannot fully replicate complex in vivo interactions, this integrated approach provides valuable toxicological insights and identifies promising therapeutic targets for clinical management of BaP-induced reproductive disorders. Further experimental studies are warranted to elucidate the precise molecular mechanisms underlying these interactions.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Di Maggio D. Ovarian Odyssey JAMA 20243321703170410.1001/jama.2024.2053639480468 · doi ↗ · pubmed ↗
- 2Guo Y. Xue L. Tang W. Xiong J. Chen D. Dai Y. Wu C. Wei S. Dai J. Wu M. Ovarian Microenvironment: Challenges and Opportunities in Protecting against Chemotherapy-Associated Ovarian Damage Hum. Reprod. Updat.20243061464710.1093/humupd/dmae 020PMC 1136922838942605 · doi ↗ · pubmed ↗
- 3Gu D. Qi M. Gui R. Zhou D. Su C. Hu P. Liu L. Global Burden of Non-Neoplastic Gynecological Diseases in Women: A 32-Year Analysis with Projections to 2100 Am. J. Prev. Med.202510809910.1016/j.amepre.2025.10809940947072 · doi ↗ · pubmed ↗
- 4Practice Committees of the American Society for Reproductive Medicine and the Society for Reproductive Endocrinology and Infertility Diagnosis and Treatment of Luteal Phase Deficiency: A Committee Opinion Fertil. Steril.20211151416142310.1016/j.fertnstert.2021.02.01033827766 · doi ↗ · pubmed ↗
- 5van der Stege J.G. Groen H. van Zadelhoff S.J.N. Lambalk C.B. Braat D.D.M. van Kasteren Y.M. van Santbrink E.J.P. Apperloo M.J.A. Schultz W.C.M.W. Hoek A. Decreased Androgen Concentrations and Diminished General and Sexual Well-Being in Women with Premature Ovarian Failure Menopause 200815233110.1097/gme.0b 013e 3180 f 6108 c 18257141 · doi ↗ · pubmed ↗
- 6Welt C.K. Primary Ovarian Insufficiency: A More Accurate Term for Premature Ovarian Failure Clin. Endocrinol.20086849950910.1111/j.1365-2265.2007.03073.x 17970776 · doi ↗ · pubmed ↗
- 7Shi L. Cui L. Yang L. He L. Jia L. Bai W. Wang L. Xu W. Hotspots and Frontiers in Luteal Phase Defect Research: An in-Depth Global Trend Bibliometric and Visualization Analysis over a 52-Year Period Heliyon 202410 e 3508810.1016/j.heliyon.2024.e 3508839170162 PMC 11336435 · doi ↗ · pubmed ↗
- 8Huang C. Wang D. Li N. Yang C. Chen X. Liu X. He J. Ding Y. Tong C. Peng C. Exposure to Ethephon Compromises Endometrial Decidualization in Mice during Early Pregnancy via GPR 120Ecotoxicol. Environ. Saf.202122011236110.1016/j.ecoenv.2021.11236134052757 · doi ↗ · pubmed ↗
