Improving ICD-based semantic similarity by accounting for varying degrees of comorbidity
Jan Janosch Schneider, Marius Adler, Christoph Ammer-Herrmenau, Alexander Otto König, Ulrich Sax, and Jonas Hügel

TL;DR
This paper improves ICD-based patient similarity measures by introducing a scale term that accounts for varying degrees of comorbidity, enhancing the accuracy of semantic similarity algorithms in real-world clinical data.
Contribution
The study presents a novel scale term to adjust for comorbidity variance and evaluates 80 algorithm combinations, identifying the most effective approach for ICD-based patient similarity.
Findings
The best algorithm combination achieved a correlation of 0.75 with expert ratings.
Accounting for comorbidity variance improves semantic similarity accuracy.
Current algorithms perform well when adjusted for comorbidity differences.
Abstract
Finding similar patients is a common objective in precision medicine, facilitating treatment outcome assessment and clinical decision support. Choosing widely available patient features and appropriate mathematical methods for similarity calculations is crucial. International Statistical Classification of Diseases and Related Health Problems (ICD) codes are used worldwide to encode diseases and are available for nearly all patients. Aggregated as sets consisting of primary and secondary diagnoses, they can display a degree of comorbidity and reveal comorbidity patterns. It is possible to compute the similarity of patients based on their ICD codes by using semantic similarity algorithms. These algorithms have traditionally been evaluated using a single-term, expert-rated data set. However, real-world patient data often display varying degrees of documented comorbidities that might impair…
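To make the idea of set-based ICD similarity concrete, here is a minimal Python sketch. It is not one of the algorithm combinations evaluated in the paper, and it does not implement the paper's scale term, which this summary does not define. It assumes a toy code-level similarity (shared character prefix, standing in for a real ontology-based measure) and a best-match-average aggregation over the two code sets. The function names `code_similarity` and `set_similarity` are hypothetical.

```python
from typing import Iterable


def code_similarity(a: str, b: str) -> float:
    """Toy code-level similarity (an assumption, not the paper's measure):
    length of the shared prefix divided by the length of the longer code."""
    a = a.replace(".", "").upper()
    b = b.replace(".", "").upper()
    if not a or not b:
        return 0.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def set_similarity(p: Iterable[str], q: Iterable[str]) -> float:
    """Best-match average: each code is matched to its most similar code
    in the other set, and the best-match scores are averaged over both sets."""
    p, q = set(p), set(q)
    if not p or not q:
        return 0.0
    p_to_q = sum(max(code_similarity(a, b) for b in q) for a in p)
    q_to_p = sum(max(code_similarity(a, b) for a in p) for b in q)
    return (p_to_q + q_to_p) / (len(p) + len(q))


# Two patients who share one diagnosis but differ in documented comorbidity.
patient_a = {"E11.9"}
patient_b = {"E11.9", "I10", "N18.3"}
score = set_similarity(patient_a, patient_b)
```

In this sketch the two patients share their only common diagnosis exactly, yet the score is pulled down by the unmatched secondary diagnoses of the second patient. This illustrates, under the stated assumptions, how differing degrees of comorbidity can affect a set-level similarity score.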
Taxonomy
Topics: Machine Learning in Healthcare · Biomedical Text Mining and Ontologies · Chronic Disease Management Strategies
