Hybrid Text Feature Modeling for Disease Group Prediction using Unstructured Physician Notes
Gokul S Krishnan, Sowmya Kamath S

TL;DR
This paper introduces a deep learning approach that uses hybrid word embeddings to predict disease groups from unstructured physician notes, offering a cost-effective alternative to traditional EHR-based systems that is especially useful in developing countries.
Contribution
It presents a novel CDSS that leverages unstructured clinical notes with hybrid embeddings, outperforming models based on structured EHR data in disease prediction accuracy.
Findings
Outperformed state-of-the-art models by 15% in AUROC
Achieved a 40% improvement in AUPRC
Demonstrated the effectiveness of unstructured text for disease prediction
Abstract
Existing Clinical Decision Support Systems (CDSSs) largely depend on the availability of structured patient data and Electronic Health Records (EHRs) to aid caregivers. However, hospitals in developing countries have not widely adopted structured patient data formats, and medical professionals there still rely on clinical notes in the form of unstructured text. Such unstructured clinical notes recorded by medical personnel can also be a rich source of patient-specific information that can be leveraged to build CDSSs, even for hospitals in developing countries. If such unstructured clinical text can be used, the manual and time-consuming process of EHR generation will no longer be required, saving substantial person-hours and cost. In this paper, we propose a generic ICD9 disease group prediction CDSS built on unstructured physician notes modeled using hybrid word…
