# Predicting neighborhood-level violence from features of the physical and social environment with machine learning

**Authors:** Veronica A. Pear, Colette Smirniotis, Rose M. C. Kagawa

PMC · DOI: 10.1186/s40621-025-00629-2 · Injury Epidemiology · 2025-11-10

## TL;DR

This study uses extreme gradient boosting to predict census-tract-level violence from 55 features of the physical and social environment in Cleveland, Ohio and Detroit, Michigan, 2011–2019.

## Contribution

The study makes novel use of extreme gradient boosting to model 55 neighborhood features simultaneously, without assuming how the features relate to one another or what functional form those relationships take.
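
As a rough illustration of how boosting fits an outcome without a pre-specified functional form, here is a minimal pure-Python sketch of gradient boosting with one-split regression trees under squared-error loss. It is not the authors' pipeline and not the XGBoost library, which adds second-order gradient information, regularization, and deeper trees. The toy data, the names `Stump`, `fit_stump`, `fit_boosting`, `predict`, and `mse`, and the hyperparameter values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Stump:
    """A one-split regression tree (hypothetical helper for this sketch)."""
    feature: int
    threshold: float
    left_value: float
    right_value: float

    def predict(self, row: Sequence[float]) -> float:
        if row[self.feature] <= self.threshold:
            return self.left_value
        return self.right_value


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Pick the single split that minimizes squared error on the residuals."""
    best, best_sse = None, float("inf")
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if sse < best_sse:
                best, best_sse = Stump(j, threshold, left_mean, right_mean), sse
    if best is None:
        raise ValueError("no feature varies, so no split is possible")
    return best


def fit_boosting(X: List[List[float]], y: List[float],
                 n_rounds: int = 50,
                 learning_rate: float = 0.3) -> Tuple[float, List[Stump], float]:
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump.predict(row) for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model: Tuple[float, List[Stump], float], row: Sequence[float]) -> float:
    base, stumps, learning_rate = model
    return base + learning_rate * sum(s.predict(row) for s in stumps)


def mse(y: Sequence[float], preds: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(y, preds)) / len(y)


# Made-up toy data (not from the paper): two features, nonlinear target.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0], [5.0, 0.0]]
y = [1.0, 1.5, 5.0, 9.5, 17.0, 25.5]

model = fit_boosting(X, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [predict(model, row) for row in X])
print(f"baseline MSE: {baseline_mse:.3f}, boosted MSE: {boosted_mse:.3f}")
```

Each round only asks which single split best explains what the current ensemble still gets wrong, so curvature and interactions can emerge from the sum of many small trees rather than from terms chosen in advance.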

## Key findings

- In the primary models, the correlation between observed and predicted counts was 0.89 for violent crime and 0.65 for firearm-involved violent crime.
- The most important predictors tended to fall in the building quality and type domain or the socioeconomic domain; multifamily homes per square mile ranked highly for both outcomes.

## Abstract

Violence is a leading cause of death and disparity in the United States. Individuals’ physical and social environments can prevent or foster violence, but these complex milieus are challenging to model. In this study, we used machine learning to identify features of the local environment that are most predictive of violence in two Midwestern cities struggling with disinvestment and crime.

This was a serial cross-sectional study of census tracts in Cleveland, Ohio and Detroit, Michigan, 2011–2019. We took a machine learning approach—extreme gradient boosting—that enabled us to model 55 neighborhood features simultaneously and without making assumptions about their relationships or functional form. These features included building quality and type, public goods and services, residential stability, socioeconomic features, historical features, and demographic features. Primary outcomes were police-reported counts per square mile of violent crime and violent crime involving a firearm in Cleveland. Secondary outcomes were homicide and firearm homicide in Cleveland and Detroit. Variable importance was assessed with Shapley values.
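
To illustrate what a Shapley value measures, the sketch below computes exact Shapley values by brute-force enumeration for a small hand-written model. This is not the authors' code: in practice Shapley values for boosted trees are usually computed with specialized algorithms, and exact enumeration is only feasible for a handful of features. The function `shapley_values`, the toy model `toy_model`, the background rows, and the choice to average over background rows for "missing" features are all assumptions made for this example.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   background: List[Sequence[float]]) -> List[float]:
    """Exact Shapley values for one observation `x` (hypothetical helper).

    Features outside a coalition are filled in from each background row and
    the model output is averaged, which is one common convention.
    """
    n = len(x)

    def value(subset: frozenset) -> float:
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = frozenset(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(row: Sequence[float]) -> float:
    # Made-up model: one additive feature plus an interaction of two others.
    return 2.0 * row[0] + row[1] * row[2]


# Made-up background rows and observation (not from the paper).
background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 1.0], [1.0, 2.0, 0.0]]
x = [3.0, 1.0, 2.0]

phi = shapley_values(toy_model, x, background)
baseline = sum(toy_model(row) for row in background) / len(background)
print("Shapley values:", phi)
print("sum of values:", sum(phi), "prediction minus baseline:", toy_model(x) - baseline)
```

The values sum to the difference between the model's prediction for `x` and its average prediction over the background rows, which is what lets per-feature contributions be aggregated into an importance ranking.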

The primary models performed well, with a correlation between observed and predicted counts of 0.89 for violent crime and 0.65 for firearm-involved violent crime. For both outcomes, the variables with the highest importance tended to be in the domains of building quality and type or socioeconomic features. Several variables had high importance for both outcomes, including multifamily homes per square mile, road network density, commercial buildings per square mile, and percentage of the population that was white.
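
The performance figures above are correlations between observed and predicted counts. The sketch below shows how such a correlation could be computed, assuming a Pearson coefficient; the summary does not say which coefficient the authors used. The function name `pearson_r` and the numbers are made up for this example and are not data from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences (hypothetical helper)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined when either sequence is constant")
    return cov / sqrt(var_o * var_p)


# Made-up tract-level densities (counts per square mile), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r = pearson_r(observed, predicted)
print(f"correlation between observed and predicted: {r:.2f}")
```

A correlation summarizes how well predictions track the ordering and spread of observed values; it does not by itself show whether predictions are systematically too high or too low.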

These findings underscore the fundamental importance of place in preventing and generating violence. Future studies should explore modifiable, highly important variables as potential points of intervention.

The online version contains supplementary material available at 10.1186/s40621-025-00629-2.

## Full-text entities

- **Diseases:** death (MESH:D003643)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12604338/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12604338/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/PMC12604338/full.md

---
Source: https://tomesphere.com/paper/PMC12604338