Automatic Extraction of Disease Risk Factors from Medical Publications

Maxim Rubchinsky; Ella Rabinovich; Adi Shraibman; Netanel Golan; Tali; Sahar; Dorit Shweiki

arXiv:2407.07373·cs.CL·July 11, 2024

TL;DR

This paper introduces a multi-step system leveraging pre-trained bio-medical models to automatically identify and extract disease risk factors from medical literature, supported by new datasets and evaluation schemes.

Contribution

It presents a comprehensive pipeline for automated risk factor extraction and provides valuable, validated datasets for future research in medical text mining.

Findings

01 Encouraging automatic and manual evaluation results

02 Effective identification and extraction of risk factors

03 Highlighting the need for improved models and datasets

Abstract

We present a novel approach to automating the identification of risk factors for diseases from medical literature, leveraging pre-trained models in the bio-medical domain, while tuning them for the specific task. Faced with the challenges of the diverse and unstructured nature of medical articles, our study introduces a multi-step system to first identify relevant articles, then classify them based on the presence of risk factor discussions and, finally, extract specific risk factor information for a disease through a question-answering model. Our contributions include the development of a comprehensive pipeline for the automated extraction of risk factors and the compilation of several datasets, which can serve as valuable resources for further research in this area. These datasets encompass a wide range of diseases, as well as their associated risk factors, meticulously identified…
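The three stages described in the abstract (find relevant articles, classify them for risk-factor discussion, extract risk factors by question answering) can be sketched as a simple filter-then-extract loop. This is a minimal illustration only, not the authors' implementation: the `Article` type, `run_pipeline`, and the keyword-based stand-ins below are all hypothetical names, and in the paper each stage is a tuned pre-trained biomedical model rather than a string match.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Article:
    title: str
    abstract: str


def run_pipeline(
    articles: List[Article],
    disease: str,
    is_relevant: Callable[[Article, str], bool],
    discusses_risk_factors: Callable[[Article], bool],
    answer_question: Callable[[str, str], str],
) -> List[str]:
    """Apply the three stages in order and collect non-empty answers."""
    answers = []
    for article in articles:
        # Stage 1: keep only articles relevant to the disease.
        if not is_relevant(article, disease):
            continue
        # Stage 2: keep only articles that discuss risk factors.
        if not discusses_risk_factors(article):
            continue
        # Stage 3: pose a question to a QA component over the abstract.
        question = f"What are the risk factors for {disease}?"
        answer = answer_question(question, article.abstract)
        if answer:
            answers.append(answer)
    return answers


# Toy stand-ins (assumptions for illustration; a real system would
# plug in trained retrieval, classification, and QA models here).
def keyword_relevance(article: Article, disease: str) -> bool:
    text = f"{article.title} {article.abstract}".lower()
    return disease.lower() in text


def keyword_classifier(article: Article) -> bool:
    return "risk factor" in article.abstract.lower()


def toy_qa(question: str, context: str) -> str:
    # Ignores the question; returns the text after a fixed marker phrase.
    marker = "risk factors include "
    start = context.lower().find(marker)
    if start == -1:
        return ""
    rest = context[start + len(marker):]
    return rest.split(".")[0].strip()


sample_articles = [
    Article("Asthma in adults",
            "Known risk factors include smoking and obesity. More work is needed."),
    Article("Asthma inhaler design", "We describe a new device."),
    Article("Migraine triggers", "Risk factors include stress."),
]

extracted = run_pipeline(
    sample_articles, "asthma", keyword_relevance, keyword_classifier, toy_qa
)
```

Passing the stages in as callables keeps the control flow separate from the models, so each placeholder can be swapped for a trained component without changing the loop.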

Code & Models

Repositories

maximrub/diseases-risk-factors (official)

Videos

Automatic Extraction of Disease Risk Factors from Medical Publications (Underline)

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling