# Identifying 20 homogeneous clusters of acute patients discharged with nonspecific diagnoses through k-prototypes mixed data clustering

**Authors:** Rasmus Gregersen Mottlau, Marie Villumsen, Axel Nyström, Hanne Nygaard, Jens Rasmussen, Mikkel B. Christensen, Jakob Lundager Forberg, Janne Petersen

PMC · DOI: 10.1186/s12873-025-01459-7 · BMC Emergency Medicine · 2026-01-10

## TL;DR

This study groups acute patients discharged with nonspecific diagnoses into 20 clusters using unsupervised machine learning (k-prototypes clustering), revealing 30-day readmission and mortality risks that differ widely between clusters.

## Contribution

Applying k-prototypes clustering to mixed numeric and categorical data identifies 20 distinct clusters among patients discharged with nonspecific diagnoses, defining delimited populations for more targeted prediction models.

## Key findings

- 20 distinct patient clusters were identified based on clinical, socioeconomic, and biochemical features.
- The 30-day risk after discharge varied widely across clusters: readmission ranged from 5% to 27% and mortality from 0% to 9%.

## Abstract

Patients discharged with nonspecific diagnoses after acute hospital care are frequent and represent potential diagnostic uncertainty at discharge. Adverse outcomes indicate missed diagnoses with a potential for improving patient safety. However, research and interventions are limited by population heterogeneity. We aimed to identify clusters of patients discharged with nonspecific diagnoses by employing unsupervised machine learning and to assess the risk of readmission and mortality of each cluster.

Observational, register-based study of emergency department arrivals discharged with nonspecific diagnoses (ICD-10: R and Z03 chapters) from March 2019 to February 2020 in Denmark. We applied partitional (k-prototypes) and hierarchical (agglomerative) clustering based on demographics, socioeconomics, comorbidities, administrative information, biochemistry, and 50 nonspecific discharge diagnosis groups. The risk of 30-day readmission and mortality after discharge was assessed as cumulative incidence for each cluster.
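The partitional step described above can be illustrated with a minimal sketch of the k-prototypes idea: each cluster prototype combines numeric means with categorical modes, and dissimilarity is squared Euclidean distance on the numeric features plus a weight `gamma` times the number of categorical mismatches. This is not the authors' implementation. The function names, the toy features, the value of `gamma`, and the random initialization are all assumptions made for illustration, and the numeric values are assumed to be standardized already.

```python
import random
from collections import Counter


def dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on numeric features plus gamma times categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric + gamma * mismatches


def k_prototypes(numeric, categorical, k, gamma=1.0, max_iter=50, seed=0):
    """Toy k-prototypes: returns (labels, prototypes). Illustrative only."""
    rng = random.Random(seed)
    n = len(numeric)
    # Assumption: prototypes start as k randomly chosen observations.
    start = rng.sample(range(n), k)
    protos = [(list(numeric[i]), list(categorical[i])) for i in start]
    labels = [-1] * n
    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest prototype.
        new_labels = [
            min(
                range(k),
                key=lambda c: dissimilarity(
                    numeric[i], categorical[i], protos[c][0], protos[c][1], gamma
                ),
            )
            for i in range(n)
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: numeric mean and categorical mode per cluster.
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                continue  # keep the previous prototype for an empty cluster
            num_proto = [
                sum(numeric[i][j] for i in members) / len(members)
                for j in range(len(numeric[0]))
            ]
            cat_proto = [
                Counter(categorical[i][j] for i in members).most_common(1)[0][0]
                for j in range(len(categorical[0]))
            ]
            protos[c] = (num_proto, cat_proto)
    return labels, protos


# Hypothetical toy data: two standardized numeric features and two categorical features.
toy_numeric = [[-1.0, -0.9], [-1.1, -1.0], [-0.9, -1.2], [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
toy_categorical = [
    ["F", "none"], ["F", "none"], ["M", "none"],
    ["M", "diabetes"], ["F", "diabetes"], ["M", "diabetes"],
]

labels, protos = k_prototypes(toy_numeric, toy_categorical, k=2)
print(labels)
```

In practice the weight `gamma` controls how much the categorical features count relative to the numeric ones, so the scaling of the numeric features matters for the result.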

We included 92,650 patients. A 20-cluster k-prototypes model fitted our data best. Clusters 1–5 were differentiated by no or limited biochemistry across different age and comorbidity patterns. Clusters 6–9 consisted mainly of young adults with low comorbidity, except Cluster 9, which had notable neuropsychiatric and substance abuse comorbidities. Clusters 10–20 described the older patients: Clusters 10–14 with single comorbidities and Clusters 15–20 with substantial comorbidity in different co-occurring patterns. The risk of 30-day readmission and mortality ranged from 5% to 27% and 0% to 9% across clusters, respectively.
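As a rough illustration of how per-cluster 30-day risks like those above could be tabulated, the sketch below computes the proportion of patients in each cluster with an event within 30 days of discharge. It assumes every patient has complete 30-day follow-up, in which case the cumulative incidence reduces to a simple proportion. The estimator used in the paper may treat censoring and competing risks differently. The record layout, the function name, and the toy values are hypothetical.

```python
from collections import defaultdict


def thirty_day_risk(records, horizon_days=30):
    """Per-cluster proportion readmitted or dead within the horizon.

    Each record is (cluster, days_to_readmission, days_to_death), where
    None means the event was not observed. Assumes complete follow-up.
    """
    counts = defaultdict(lambda: {"n": 0, "readmitted": 0, "died": 0})
    for cluster, days_to_readmission, days_to_death in records:
        row = counts[cluster]
        row["n"] += 1
        if days_to_readmission is not None and days_to_readmission <= horizon_days:
            row["readmitted"] += 1
        if days_to_death is not None and days_to_death <= horizon_days:
            row["died"] += 1
    return {
        cluster: {
            "readmission": row["readmitted"] / row["n"],
            "mortality": row["died"] / row["n"],
        }
        for cluster, row in counts.items()
    }


# Hypothetical toy records, not data from the study.
toy_records = [
    (1, None, None), (1, 12, None), (1, None, None), (1, 45, None),
    (2, 3, None), (2, 20, 25), (2, None, None), (2, None, 90),
]

risk = thirty_day_risk(toy_records)
for cluster in sorted(risk):
    print(cluster, risk[cluster])
```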

Patients with nonspecific discharge diagnoses after acute hospital contacts can be grouped into 20 distinct clusters based on clinical, socioeconomic, administrative, and biochemical features. The clusters can be used to form delimited populations allowing for better and more individualized prediction models.

The online version contains supplementary material available at 10.1186/s12873-025-01459-7.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12882614/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12882614/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12882614/full.md

---
Source: https://tomesphere.com/paper/PMC12882614