Categorical Co-Frequency Analysis: Clustering Diagnosis Codes to Predict Hospital Readmissions
Hallee E. Wong, Brianna C. Heggeseth, Steven J. Miller

TL;DR
This paper introduces Categorical Co-Frequency Analysis (CoFA), a novel clustering method for diagnosis codes that improves hospital readmission risk prediction by grouping similar diagnoses based on their relationship with readmission outcomes.
Contribution
The paper presents CoFA, a new clustering technique for ICD diagnosis codes that enhances predictive modeling of hospital readmissions without increasing model complexity.
Findings
CoFA clusters are predictive of readmission risk.
Replacing ICD majors with CoFA groups maintains prediction accuracy.
Homogeneity of readmission risk across diagnosis groups is observed.
Abstract
Accurately predicting patients' risk of 30-day hospital readmission would enable hospitals to efficiently allocate resource-intensive interventions. We develop a new method, Categorical Co-Frequency Analysis (CoFA), for clustering diagnosis codes from the International Classification of Diseases (ICD) according to the similarity in relationships between covariates and readmission risk. CoFA measures the similarity between diagnoses by the frequency with which two diagnoses are split in the same direction versus split apart in random forests to predict readmission risk. Applying CoFA to de-identified data from Berkshire Medical Center, we identified three groups of diagnoses that vary in readmission risk. To evaluate CoFA, we compared readmission risk models using ICD majors and CoFA groups to a baseline model without diagnosis variables. We found substituting ICD majors for the…
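The co-frequency idea in the abstract can be illustrated with a minimal sketch. It assumes each random-forest split on the diagnosis variable is available as a pair of disjoint sets of diagnosis codes (those sent left and those sent right). The function name `co_frequency`, the input format, the toy labels `dx_a`, `dx_b`, `dx_c`, and the normalization (same-side count divided by same-side plus split-apart count) are illustrative assumptions, not the paper's exact definition.

```python
from collections import defaultdict
from itertools import combinations


def co_frequency(splits):
    """Pairwise similarity from categorical splits (hypothetical helper).

    splits: iterable of (left, right) pairs, each a set of diagnosis codes;
            left and right are assumed disjoint.
    Returns a dict mapping frozenset({a, b}) to the fraction of splits
    involving both codes that sent them in the same direction.
    """
    together = defaultdict(int)  # times a pair landed on the same side
    apart = defaultdict(int)     # times a pair was separated by the split

    for left, right in splits:
        # Pairs on the same side of this split.
        for side in (left, right):
            for a, b in combinations(sorted(side), 2):
                together[frozenset((a, b))] += 1
        # Pairs separated by this split.
        for a in left:
            for b in right:
                apart[frozenset((a, b))] += 1

    pairs = set(together) | set(apart)
    return {p: together[p] / (together[p] + apart[p]) for p in pairs}


# Toy input: three splits over three made-up diagnosis labels.
toy_splits = [
    ({"dx_a", "dx_b"}, {"dx_c"}),
    ({"dx_a"}, {"dx_b", "dx_c"}),
    ({"dx_a", "dx_b"}, {"dx_c"}),
]
sim = co_frequency(toy_splits)
```

In this toy input, `dx_a` and `dx_b` fall on the same side in two of three splits, so their similarity is 2/3, while `dx_a` and `dx_c` are always separated. A similarity matrix built this way could then be passed to any standard clustering routine to form diagnosis groups; which routine the authors use is not stated in this summary.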
Taxonomy
Topics: Chronic Disease Management Strategies · Machine Learning in Healthcare · Statistical Methods in Epidemiology
