# Intrinsic tumor factors and extrinsic environmental and social exposures contribute to endometrial cancer recurrence patterns

**Authors:** Jesus Gonzalez Bosquet, Oyomoare Osazuwa-Peters, Vincent M. Wagner, Andrew Polio, Rebecca Hoyd, Ahmad A. Tarhini, Casey M. Cosgrove, Marilyn S. Huang, Bradley R. Corr, Aliza L. Leiser, Bodour Salhia, Kathleen Darcy, Rob L. Dood, Lauren E. Dockery, Michael J. Cavnar, Lisa Landrum, Laura Chambers, Aik Choon Tan, Ning Jin, Robert J. Rounbehler, Michelle L. Churchman, Dan Spakowicz

PMC · DOI: 10.21203/rs.3.rs-8682460/v1 · Research Square · 2026-01-30

## TL;DR

This study shows that both tumor-related factors and environmental influences, like bacteria and air pollution, affect how endometrial cancer recurs, suggesting personalized models could improve predictions.

## Contribution

The study integrates microbiome, environmental, and clinical data to build predictive models of endometrial cancer recurrence across risk groups.

## Key findings

- Extrinsic factors like air pollution and tumor-associated bacteria significantly influence endometrial cancer recurrence patterns.
- Machine learning models combining clinical, genomic, and environmental data achieved excellent predictive performance (AUC ~0.9).
- Low-risk EC models showed higher relevance of BMI and specific bacterial genera like Bacillus compared to high-risk groups.

## Abstract

In a previous study, we trained, validated and tested models of endometrial cancer (EC) recurrence integrating clinical, genomic and pathological data from the Oncology Research Information Exchange Network (ORIEN). Preliminary studies also have demonstrated that bacterial communities may influence the risk of EC recurrence by altering the local environment within the upper female genital tract. The objective of this study was to evaluate whether extrinsic and environmental factors, including tumor-associated bacterial communities, tumor immune contexture and air pollution alongside clinical, pathologic and genomic features are associated with EC recurrence across clinically relevant risk groups.

We performed a retrospective, multi-institution, case–control study with data from the ORIEN network EC dataset. Data were stratified into three groups: low-risk (FIGO grade 1 or 2, stage I; N = 329), high-risk (FIGO grade 3 or stages II-IV; N = 324), and non-endometrioid histology (N = 239). RNA and DNA were extracted from tumor specimens and processed to obtain the necessary genomic and metagenomic data. Genus-level microbiome data were extracted and curated from RNA sequencing using the Kraken2, Bracken and exotic software packages. Risk of EC recurrence was evaluated by integrating microbiome and environmental data alongside existing clinical, pathological and genomic data using topic modelling with latent Dirichlet allocation (LDA). Prediction models of EC recurrence were created using machine learning (ML) and deep learning (DL) analytics with MATLAB apps and TensorFlow. Finally, the performance of both topic and prediction models was externally validated in an independent EC dataset from TCGA.
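The three-way stratification described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `assign_risk_group`, its argument names and the string encodings of histology and stage are hypothetical. It also assumes that the low-risk and high-risk groups are restricted to endometrioid histology, which the abstract implies but does not state outright.

```python
from typing import Literal

RiskGroup = Literal["low-risk", "high-risk", "non-endometrioid"]


def assign_risk_group(histology: str, figo_grade: int, stage: str) -> RiskGroup:
    """Assign a case to one of the three risk groups named in the abstract.

    Assumption (not confirmed by the source): any histology other than
    "endometrioid" goes to the non-endometrioid group regardless of
    grade or stage.
    """
    if histology.strip().lower() != "endometrioid":
        return "non-endometrioid"

    # Keep only the leading Roman numeral, e.g. "IA" -> "I", "IIIC1" -> "III".
    numeral = ""
    for ch in stage.strip().upper():
        if ch in "IV":
            numeral += ch
        else:
            break
    if numeral not in {"I", "II", "III", "IV"}:
        raise ValueError(f"unrecognized FIGO stage: {stage!r}")
    if figo_grade not in (1, 2, 3):
        raise ValueError(f"unrecognized FIGO grade: {figo_grade!r}")

    # Low-risk: grade 1 or 2 AND stage I. Everything else is high-risk
    # (grade 3 OR stages II-IV).
    if numeral == "I" and figo_grade in (1, 2):
        return "low-risk"
    return "high-risk"
```

For example, under these assumptions a grade 2, stage IA endometrioid tumor would be labelled low-risk, while the same tumor at stage IIIC1 would be labelled high-risk.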

The topic models demonstrated the complexity of factors involved in EC recurrence. The components of the topic models, and specifically the microbiome, changed when environmental factors such as air pollutants were introduced. In the low-risk EC group, genera that were abundant in models before environmental factors were introduced, such as Thermothielavioides, Theileria and Rhizoctonia, were scarcely seen afterwards. Bacillus was the genus with the highest per-topic probability in all risk groups, especially in low-risk EC (28%). Ozone (O3) was a component of the models for all risk groups. BMI was the sole informative clinical variable after data integration, and it was present only in the low-risk group. Models for the high-risk and non-endometrioid groups included differentially expressed genes: MMP13, S100A7, SMOC1 and ACACA in the high-risk group, and ADD2, DLX5, SLCO2B1 and NWD1 in the non-endometrioid group. CNVs were also present in both the low-risk and non-endometrioid groups, but their per-topic probabilities were low. The same was true for the immune contexture data. The components of the topic models were used to train, validate and test prediction models of EC recurrence by risk group. Performance of these models was excellent (AUC ≈ 0.9). Although some microbiome components of the topic models were missing from TCGA, prediction models trained on the ORIEN set performed similarly on the TCGA testing set, with overlapping AUC 95% CIs.
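To make the comparison of AUC 95% CIs between training and external test sets concrete, here is a minimal standard-library Python sketch. It is not the authors' pipeline, which used MATLAB apps and TensorFlow. The abstract does not say how the confidence intervals were computed, so the percentile bootstrap used here is an assumption, and the function names `auc`, `bootstrap_auc_ci` and `intervals_overlap` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a recurrence case (label 1) outscores
    a non-recurrence case (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the AUC
    (an assumed method; the source does not specify one)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that happen to contain a single class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

One would compute an interval on the held-out portion of the training cohort and another on the external cohort, then pass both to `intervals_overlap`. Overlapping intervals are consistent with similar performance but do not amount to a formal test of equivalence.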

Both extrinsic factors (tumor-associated bacterial communities, tumor immune contexture and air pollution) and intrinsic factors predict EC recurrence. The complexity of tumor and host factors influencing cancer relapse underscores the need for more individualized prediction models of disease outcomes.

## Linked entities

- **Genes:** MMP13 (matrix metallopeptidase 13) [NCBI Gene 4322], S100A7 (S100 calcium binding protein A7) [NCBI Gene 6278], SMOC1 (SPARC related modular calcium binding 1) [NCBI Gene 64093], ACACA (acetyl-CoA carboxylase alpha) [NCBI Gene 31], ADD2 (adducin 2) [NCBI Gene 119], DLX5 (distal-less homeobox 5) [NCBI Gene 1749], SLCO2B1 (solute carrier organic anion transporter family member 2B1) [NCBI Gene 11309], NWD1 (NACHT and WD repeat domain containing 1) [NCBI Gene 284434]
- **Chemicals:** Ozone (PubChem CID 24823), O3 (PubChem CID 24823)
- **Diseases:** endometrial cancer (MONDO:0002447)
- **Species:** Thermothielavioides (taxon 2609811), Theileria (taxon 5873), Rhizoctonia (taxon 1322061), Bacillus (taxon 1386)

## Full-text entities

- **Genes:** NWD1 (NACHT and WD repeat domain containing 1) [NCBI Gene 284434], DLX5 (distal-less homeobox 5) [NCBI Gene 1749] {aka SHFM1, SHFM1D}, MMP13 (matrix metallopeptidase 13) [NCBI Gene 4322] {aka CLG3, MANDP1, MDST, MMP-13}, ACACA (acetyl-CoA carboxylase alpha) [NCBI Gene 31] {aka ACAC, ACACAD, ACACalpha, ACC, ACC1, ACCA}, SMOC1 (SPARC related modular calcium binding 1) [NCBI Gene 64093] {aka OAS}, SLCO2B1 (solute carrier organic anion transporter family member 2B1) [NCBI Gene 11309] {aka OATP-B, OATP2B1, OATPB, SLC21A9}, ADD2 (adducin 2) [NCBI Gene 119] {aka ADDB}, S100A7 (S100 calcium binding protein A7) [NCBI Gene 6278] {aka PSOR1, S100A7c}
- **Diseases:** cancer (MESH:D009369), EC (MESH:D016889)
- **Chemicals:** O3 (MESH:D010126)
- **Species:** Enterovirus C (no rank) [taxon 138950], Rhizoctonia (genus) [taxon 1322061], Bacillus (genus) [taxon 55087], Homo sapiens (human, species) [taxon 9606], Theileria (genus) [taxon 5873]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12869553/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12869553/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12869553/full.md

---
Source: https://tomesphere.com/paper/PMC12869553