# Predicting High Urinary Tract Infection Rates in Skilled Nursing Facilities: A Machine Learning Approach

**Authors:** Diane Dolezel, Tiankai Wang, Denise Gobert

PMC · DOI: 10.3390/healthcare13202632 · Healthcare · 2025-10-20

## TL;DR

This study uses machine learning to predict high UTI rates in skilled nursing facilities (SNFs) from facility-level characteristics such as geographic location, staffed beds, and staffing hours.

## Contribution

The study introduces a machine learning approach to predict UTI risk in SNFs using facility-level data.

## Key findings

- The machine learning models (Random Forest, XGBoost, and LightGBM) outperformed traditional logistic regression in predicting high UTI rates.
- In the best-performing model, Random Forest, rural SNFs and the number of staffed beds were the most influential predictors of high UTI rates.
- Average length of stay and geographic location were the next most influential predictors.

## Abstract

Objectives: Urinary tract infections (UTIs) are the most common healthcare-associated infections in Skilled Nursing Facilities (SNFs); they are associated with longer lengths of stay, higher levels of care, increased treatment costs, and higher mortality rates. This study aimed to develop a machine learning classification model to predict the risk of high catheter-associated urinary tract infection rates based on SNF characteristics. Methods: We analyzed 94,877 SNF-year observations from 2019 to 2024; because these are not unique facilities, individual SNFs may appear in multiple years. The factor variables were average length of stay in days, number of staffed beds, total nurse and total physical therapy staffing hours per resident per day, facility ownership, geographic classification, facility accreditation, Accountable Care Organization affiliations, Centers for Medicare and Medicaid Services SNF Overall Star Rating, and the SNF-year of the observations. We utilized three machine learning models for this analysis: Random Forest, XGBoost, and LightGBM. We used Shapley Additive exPlanations to interpret the best-performing machine learning model by visualizing feature importance and examining the relationship between key predictors and the outcome. Results: We found that machine learning models outperformed traditional logistic regression in predicting UTIs in skilled nursing facilities. Using the best-performing model, Random Forest, we identified rural SNFs and the number of staffed beds as the most influential predictors of high UTI rates, followed by average length of stay and geographic location. Conclusions: This study demonstrates the value of using facility-level characteristics to predict the risk of UTIs in SNFs with machine learning models. Results from this study can inform infection prevention efforts in post-acute care settings.
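The abstract names Shapley Additive exPlanations as the interpretation method. As a rough illustration of what a Shapley attribution is, the sketch below computes exact Shapley values for a tiny, made-up scoring function. Everything in it is an assumption for illustration: the function `toy_risk_score`, its coefficients, the three feature names, and the single baseline facility are hypothetical and are not the study's Random Forest, data, or results. Practical SHAP tooling typically estimates these values against a background dataset rather than one baseline row.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names echoing predictors listed in the abstract.
FEATURES = ["rural", "staffed_beds", "avg_length_of_stay"]


def toy_risk_score(x):
    """Made-up scoring function (NOT the study's model); coefficients are arbitrary."""
    score = 0.05
    score += 0.15 * x["rural"]
    score += 0.001 * x["staffed_beds"]
    score += 0.0005 * x["rural"] * x["staffed_beds"]  # interaction term
    score += 0.002 * x["avg_length_of_stay"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: features outside a coalition take their baseline value."""
    n = len(FEATURES)
    values = {}
    for feature in FEATURES:
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {
                    f: (instance[f] if f in subset else baseline[f])
                    for f in FEATURES
                }
                with_feature = dict(without)
                with_feature[feature] = instance[feature]
                # Marginal contribution of adding `feature` to this coalition.
                total += weight * (model(with_feature) - model(without))
        values[feature] = total
    return values


# Hypothetical reference facility and facility to explain.
baseline = {"rural": 0, "staffed_beds": 100, "avg_length_of_stay": 30}
facility = {"rural": 1, "staffed_beds": 150, "avg_length_of_stay": 45}

phi = shapley_values(toy_risk_score, facility, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the score for `facility` and the score for `baseline`, which is the additivity property that lets per-feature contributions be read as shares of a single prediction. The interaction term is split evenly between `rural` and `staffed_beds`.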

## Full-text entities

- **Diseases:** UTIs (MESH:D014552), infection (MESH:D007239)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12564364/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12564364/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12564364/full.md

---
Source: https://tomesphere.com/paper/PMC12564364