EQ-5D Classification Using Biomedical Entity-Enriched Pre-trained Language Models and Multiple Instance Learning
Zhyar Rzgar K Rostam, Gábor Kertész

TL;DR
This paper enhances EQ-5D publication detection by fine-tuning biomedical language models with entity enrichment and applying multiple instance learning, achieving high accuracy and recall for systematic review screening.
Contribution
It introduces a novel approach combining biomedical entity-enriched PLMs and MIL for improved detection of EQ-5D usage in abstracts.
Findings
F1-score improved to 0.82
Achieved nearly perfect recall at study level
Entity enrichment significantly boosts model performance
Abstract
The EQ-5D (EuroQol 5-Dimensions) is a standardized instrument for the evaluation of health-related quality of life. In health economics, systematic literature reviews (SLRs) depend on the correct identification of publications that use the EQ-5D, but manual screening of large volumes of scientific literature is time-consuming, error-prone, and inconsistent. In this study, we investigate fine-tuning of general-purpose (BERT) and domain-specific (SciBERT, BioBERT) pre-trained language models (PLMs), enriched with biomedical entity information extracted through scispaCy models for each statement, to improve EQ-5D detection from abstracts. We conduct nine experimental setups, including combining three scispaCy models with three PLMs, and evaluate their performance at both the sentence and study levels. Furthermore, we explore a Multiple Instance Learning (MIL) approach with attention…
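The attention-based MIL idea mentioned in the abstract can be illustrated with a small sketch. The abstract is truncated here, so the paper's exact formulation is not visible; the code below assumes a simple linear attention scorer over sentence vectors and a logistic study-level classifier. The names attention_mil_pool, study_probability, and all weights and vectors are hypothetical and chosen only for illustration.

```python
import math

def attention_mil_pool(sentence_vectors, attn_weights):
    """Pool a bag of sentence vectors into one study-level vector.

    Assumed scoring: score_i = attn_weights . h_i, alpha = softmax(scores),
    pooled = sum_i alpha_i * h_i. This is an illustrative choice, not the
    paper's confirmed architecture.
    """
    scores = [sum(w * x for w, x in zip(attn_weights, h)) for h in sentence_vectors]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(sentence_vectors[0])
    pooled = [sum(a * h[d] for a, h in zip(alphas, sentence_vectors)) for d in range(dim)]
    return alphas, pooled

def study_probability(pooled, clf_weights, bias):
    """Logistic classifier on the pooled vector (hypothetical weights)."""
    logit = sum(w * x for w, x in zip(clf_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy bag: three sentence embeddings from one abstract (made-up numbers).
bag = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
alphas, pooled = attention_mil_pool(bag, attn_weights=[1.0, 1.0])
prob = study_probability(pooled, clf_weights=[0.5, 0.5], bias=-0.5)
```

In this toy bag the second sentence receives the largest attention weight, so it dominates the study-level vector. In a real pipeline the sentence vectors would come from a fine-tuned PLM and the attention and classifier weights would be learned jointly.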
Taxonomy
Topics: Machine Learning in Healthcare · Health Systems, Economic Evaluations, Quality of Life · Artificial Intelligence in Healthcare and Education
