Inferring disease correlation from healthcare data
Gargi Priyadarshini, Ashish Anand

TL;DR
This paper presents a method to extract and validate disease correlations from electronic health records using machine learning, NLP, and biomedical knowledge bases, revealing meaningful disease relationships and gene interactions.
Contribution
It introduces a novel approach combining NLP, bioinformatics, and statistical filtering to infer disease correlations from unstructured healthcare data.
Findings
Identified disease co-morbidities and risk factors from discharge summaries.
Validated disease relations against biomedical literature.
Linked disease correlations to gene interaction networks.
Abstract
Electronic Health Records maintained in health care settings are a potential source of substantial clinical knowledge. The massive volume of data, the unstructured nature of the records, and the need for domain expertise together make knowledge extraction from them challenging. The aim of this study is to overcome this challenge through methodical analysis, abstraction and summarization of such data. It is an attempt to explain clinical observations through biomedical and genomic data. Discharge summaries of obesity patients were processed to extract coherent patterns. This was supported by Machine Learning and Natural Language Processing based technologies and a concept mapping tool, along with biomedical, clinical and genomic knowledge bases. Semantic relations between diseases were extracted and filtered through a chi-square test to remove spurious relations. The remaining…
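The chi-square filtering step mentioned in the abstract can be illustrated with a minimal sketch. The summary here does not say how the contingency tables were built or which significance threshold was used, so the 2x2 co-occurrence layout, the function names `chi_square_2x2` and `filter_relations`, the 0.05 cutoff, and the counts are all assumptions made for illustration, not the authors' implementation.

```python
import math


def chi_square_2x2(both, only_a, only_b, neither):
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    Hypothetical table layout: number of records mentioning both
    diseases, only disease A, only disease B, or neither.
    """
    n = both + only_a + only_b + neither
    row_a, row_not_a = both + only_a, only_b + neither
    col_b, col_not_b = both + only_b, only_a + neither
    if min(row_a, row_not_a, col_b, col_not_b) == 0:
        # Degenerate table: one disease is always or never present.
        return 0.0, 1.0
    stat = (n * (both * neither - only_a * only_b) ** 2
            / (row_a * row_not_a * col_b * col_not_b))
    # For 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def filter_relations(counts, alpha=0.05):
    """Keep disease pairs whose co-occurrence departs from independence."""
    kept = []
    for pair, table in counts.items():
        _, p_value = chi_square_2x2(*table)
        if p_value < alpha:
            kept.append(pair)
    return kept


# Invented counts for illustration only; not data from the paper.
example_counts = {
    ("disease_A", "disease_B"): (30, 10, 10, 50),
    ("disease_A", "disease_C"): (10, 10, 10, 10),
}
significant_pairs = filter_relations(example_counts)
```

With these invented counts, the first pair is retained and the second is dropped because its table is exactly what independence would predict. The test flags any departure from independence, including pairs that co-occur less often than expected, so a real pipeline would likely also check the direction of the association.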
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks · Machine Learning in Healthcare
