# A machine learning approach elucidates spatial patterns of environmental properties driving microbial composition over Santos Basin, South Atlantic

**Authors:** Julio Cezar Fornazier Moreira, Flúvio Modolon, Natascha Menezes Bergo, Danilo Candido Vieira, Gustavo Fonseca, Francielli Vilela Peres, Rebeca Graciela Matheus Lizárraga, Diana Carolina Duque-Castaño, Alice de Moura Emilio, Augusto Miliorini Amendola, Renato Gamba Romano, Mateus Gustavo Chuqui, Fabiana S Paula, Daniel Leite Moreira, Célio Roberto Jonck, Amanda Bendia, Frederico Pereira Brandini, Vivian Helena Pellizari

PMC · DOI: 10.1093/femsmc/xtag008 · FEMS Microbes · 2026-02-09

## TL;DR

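This study combines 16S rRNA amplicon sequencing, flow cytometry, and machine learning to show how temperature, salinity, water density, and nutrients structure microbial communities by depth in Brazil's Santos Basin, and it proposes the resulting predictive framework as a tool for monitoring marine ecosystems.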

## Contribution

A hybrid machine learning framework (Self-Organizing Maps plus Random Forest) predicts five depth-specific microbial associations in the Santos Basin with 86% accuracy.

## Key findings

- Five depth-specific microbial associations were identified, driven primarily by temperature, salinity, water density, and nutrient availability.
- Microbial diversity increases with depth, whereas cell abundance is higher in nutrient-rich shallow waters.
- Regional oceanographic processes, including the Cabo Frio upwelling and the Rio de la Plata plume, influence temporal and spatial patterns in the microbial communities.
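
To make the diversity finding concrete, here is a minimal sketch of one common diversity measure. This summary does not say which metric the authors used, so the Shannon index is an assumption chosen for illustration, and the count vectors are invented rather than taken from the paper.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical taxon counts, invented for illustration only.
shallow_sample = [900, 50, 30, 20]    # many cells, one dominant taxon
deep_sample = [250, 250, 250, 250]    # same total, evenly spread taxa

h_shallow = shannon_index(shallow_sample)
h_deep = shannon_index(deep_sample)
```

With equal total counts, the evenly distributed sample scores higher than the one dominated by a single taxon, which shows how diversity and cell abundance can move in opposite directions along a depth gradient.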

## Abstract

Marine microbial communities are vital to biogeochemical cycling, yet their dynamics in regions of ecological and industrial significance, such as the Santos Basin (SB), Brazil’s largest offshore oil-producing basin, remain poorly resolved. To address this gap, we combined 16S rRNA amplicon sequencing, flow cytometry, and a hybrid machine learning framework (Self-Organizing Maps and Random Forest) to analyze microbial community stratification across pelagic depths in the SB. We identified five depth-specific microbial associations, predicted with 86% accuracy and driven primarily by temperature, salinity, water density, and nutrient availability. Shallow epipelagic and mesopelagic zones were dominated by temperature-driven assemblages, while deeper bathypelagic communities responded to salinity and density gradients. Temporal and spatial patterns further highlighted the influence of regional oceanographic processes, including the Cabo Frio upwelling and the Rio de la Plata plume. Microbial diversity increased with depth, in contrast to the higher cell abundances found in nutrient-rich shallow waters. We provide new insights into the relative importance of oceanographic processes, suggesting that vertical stratification and regional hydrography may play a more central role in shaping microbial communities than previously recognized. We also establish a predictive framework for microbial dynamics in marine ecosystems, with direct implications for assessing anthropogenic impacts in industrially active regions like the SB.

A data-driven study using machine learning reveals how environmental gradients structure depth-specific microbial communities in Brazil’s oil-rich Santos Basin, offering a predictive framework for ecosystem monitoring.
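
The hybrid approach named in the abstract can be sketched as a two-stage pipeline: a Self-Organizing Map groups samples by community profile, and a Random Forest then predicts each sample's group from environmental variables. The code below is one plausible reading of that description, not the authors' implementation. The data are synthetic, the two-unit SOM (`train_som`, `assign_units`) is a hand-written toy, and the use of NumPy and scikit-learn is an assumption. The paper reports five associations, while this toy uses two.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 100  # samples per synthetic depth layer

# Synthetic stand-in data, NOT from the paper.
# `community` columns: relative abundance of two hypothetical taxa.
community = np.vstack([
    rng.normal([0.8, 0.2], 0.05, size=(n, 2)),  # "shallow" profile
    rng.normal([0.2, 0.8], 0.05, size=(n, 2)),  # "deep" profile
])
# `env` columns: temperature (deg C) and salinity, also invented.
env = np.vstack([
    rng.normal([24.0, 36.5], 0.5, size=(n, 2)),
    rng.normal([4.0, 34.8], 0.5, size=(n, 2)),
])


def train_som(data, n_units=2, epochs=30, lr0=0.5, sigma0=1.0, seed=0):
    """Train a minimal 1-D self-organizing map and return its unit weights."""
    order_rng = np.random.default_rng(seed)
    mean = data.mean(axis=0)
    # Linear initialisation: spread the units along the first principal axis.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    spread = np.linspace(-1.0, 1.0, n_units)[:, None] * data.std(axis=0).mean()
    weights = mean + spread * vt[0]
    positions = np.arange(n_units)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[order_rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.01    # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def assign_units(data, weights):
    """Label each sample with the index of its nearest SOM unit."""
    dist = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


# Stage 1: unsupervised grouping of community profiles.
weights = train_som(community)
association = assign_units(community, weights)

# Stage 2: supervised prediction of the group from environmental variables.
X_train, X_test, y_train, y_test = train_test_split(
    env, association, test_size=0.3, random_state=0
)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)
importances = dict(zip(["temperature", "salinity"], forest.feature_importances_))
```

In this sketch, `accuracy` plays the role of the 86% figure reported in the abstract, and `importances` corresponds to ranking drivers such as temperature and salinity. Real data would need more SOM units, proper preprocessing of amplicon counts, and cross-validation.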

## Full-text entities

- **Genes:** PON1 (paraoxonase 1) [NCBI Gene 5444] {aka ESA, MVCD5, PON}, GRHL3 (grainyhead like transcription factor 3) [NCBI Gene 57822] {aka SOM, TFCP2L4, VWS2}, SRF (serum response factor) [NCBI Gene 6722] {aka MCM1}, SOS2 (SOS Ras/Rho guanine nucleotide exchange factor 2) [NCBI Gene 6655] {aka NS9, SOS-2}
- **Diseases:** DCM (MESH:D057887)
- **Chemicals:** agarose (MESH:D012685), iron (MESH:D007501), SYBR Green I (MESH:C098022), Water (MESH:D014867), NO3 (MESH:C038619), acetic acid (MESH:D019342), glutaraldehyde (MESH:D005976), oxygen (MESH:D010100), ammonia (MESH:D000641), phosphate (MESH:D010710), nitrate (MESH:D009566), sulfur (MESH:D013455), silicate (MESH:D017640), BL3 (-), oil (MESH:D009821), acetone (MESH:D000096), carbon (MESH:D002244), chlorophyll (MESH:D002734), POC (MESH:C042234), nitrite (MESH:D009573), nitrogen (MESH:D009584), EDTA (MESH:D004492)
- **Species:** Pseudomonadota (proteobacteria, phylum) [taxon 1224], Candidatus Poseidoniales (order) [taxon 133814], Fidelibacterota (Marine Group A, phylum) [taxon 62680], Cyanobacteriota (blue-green algae, phylum) [taxon 1117], Synechococcus (genus) [taxon 1129], Prochlorococcus (genus) [taxon 1218], Dehalococcoidia (class) [taxon 301297], Homo sapiens (human, species) [taxon 9606], Thermoplasmata (class) [taxon 183967], Bacteria Latreille et al. 1825 (Bacteria stick insect, genus) [taxon 629395], Flavobacteriales (order) [taxon 200644], Planctomycetia (class) [taxon 203683], Planctomycetota (phylum) [taxon 203682]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12951518/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12951518/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/PMC12951518/full.md

---
Source: https://tomesphere.com/paper/PMC12951518