# Predicting Low Birth Weight in Big Cities in the United States Using a Machine Learning Approach

**Authors:** Yulia Treister-Goltzman

PMC · DOI: 10.3390/ijerph22060934 · International Journal of Environmental Research and Public Health · 2025-06-13

## TL;DR

This study uses machine learning to predict low birth weight rates in large U.S. cities, identifying key factors like poverty and prenatal care.

## Contribution

The study introduces a machine learning approach to predict low birth weight at the population level in big U.S. cities.

## Key findings

- All four machine learning models (KNN, Best subset, Lasso, and XGBoost) performed well, with R-squared values between 0.79 and 0.82.
- Predictors that were influential in three or four of the models were chlamydia infection rate, racial segregation, prenatal care, percentage of single-parent families, and poverty.
- The Best subset model combined a high R-squared (0.81) with the lowest residual root mean squared error (0.87) while using only four predictors.
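
The two metrics quoted above can be computed as in the following minimal sketch. The function names and the five observed/predicted values are illustrative assumptions, not data or code from the study, and the paper's "residual root mean squared error" may use a different denominator than the plain mean used here.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, dividing by the number of observations."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


# Made-up illustrative values (NOT data from the study):
# hypothetical low birth weight rates in percent for five cities.
observed = [8.0, 9.0, 10.0, 11.0, 12.0]
predicted = [8.5, 8.5, 10.0, 11.5, 11.5]

print(f"R-squared: {r_squared(observed, predicted):.2f}")
print(f"RMSE: {rmse(observed, predicted):.3f}")
```

A higher R-squared means the model explains more of the variation between cities, while a lower RMSE means smaller typical errors in the predicted rate, so the two can rank models differently, as they do for KNN and Best subset in the abstract.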

## Abstract

Objective: Low birth weight is a serious public health problem even in developed countries. The objective of this study was to assess the ability of machine learning to predict low birth weight rates in big cities in the USA on an ecological/population level. Study design: The study was based on publicly available data from the Big Cities Health Inventory Data Platform. The collected data related to the 35 largest, most urban cities in the United States from 2010 to 2022. The model-agnostic approach was used to assess and visualize the magnitude and direction of the most influential predictors. Results: The models showed excellent performance with R-squared values of 0.82, 0.81, 0.81, and 0.79, and residual root mean squared error values of 1.06, 0.87, 1.03, 0.99 for KNN, Best subset, Lasso, and XGBoost, respectively. It is noteworthy that the Best subset selection approach had a high RSq and the lowest residual root mean squared error, with only a four-predictor subset. Influential predictors that appeared in three/four models were rate of chlamydia infection, racial segregation, prenatal care, percentage of single-parent families, and poverty. Other important predictors were the rate of violent crimes, life expectancy, mental distress, income inequality, hazardous air quality, prevalence of hypertension, percent of foreign-born citizens, and smoking. This study was limited by the unavailability of data on gestational age. Conclusions: The machine learning algorithms showed excellent performance for the prediction of low birth weight rate in big cities. The identification of influential predictors can help local and state authorities and health policy decision makers to more effectively tackle this important health problem.
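
To illustrate the idea behind Best subset selection mentioned in the abstract, the sketch below fits ordinary least squares on every predictor subset of a fixed size and keeps the subset with the lowest residual sum of squares. This is a generic textbook version under stated assumptions, not the author's implementation: the selection criterion, the helper names (`solve_linear`, `fit_ols`, `best_subset`), and the tiny synthetic dataset are all hypothetical.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def residual_ss(rows, y, coef):
    """Residual sum of squares for intercept coef[0] and slopes coef[1:]."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    return sum((t - p) ** 2 for t, p in zip(y, preds))


def best_subset(x, y, size):
    """Try every subset of `size` columns; return (columns, RSS) of the best."""
    best = None
    for cols in combinations(range(len(x[0])), size):
        rows = [[r[j] for j in cols] for r in x]
        score = residual_ss(rows, y, fit_ols(rows, y))
        if best is None or score < best[1]:
            best = (cols, score)
    return best


# Synthetic data (NOT from the study): y = 1 + 2 * x0 + 3 * x2 exactly.
x = [[1, 2, 0], [2, 1, 1], [3, 4, 0], [4, 3, 1], [5, 6, 1], [6, 5, 0]]
y = [3, 8, 7, 12, 14, 13]

cols, score = best_subset(x, y, size=2)
print("selected columns:", cols)
```

On this synthetic data the search selects columns 0 and 2, the two that generate `y`. Exhaustive search grows combinatorially with the number of candidate predictors, so it is practical only for modest predictor counts.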

## Linked entities

- **Diseases:** chlamydia infection (MONDO:0021697)

## Full-text entities

- **Diseases:** hypertension (MESH:D006973), chlamydia infection (MESH:D002690)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12192627/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12192627/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12192627/full.md

---
Source: https://tomesphere.com/paper/PMC12192627