Leveraging Prompt-Learning for Structured Information Extraction from Crohn's Disease Radiology Reports in a Low-Resource Language
Liam Hazan, Gili Focht, Naama Gavrielov, Roi Reichart, Talar Hagopian, Mary-Louise C. Greer, Ruth Cytter Kuint, Dan Turner, Moti Freiman

TL;DR
This paper presents SMP-BERT, a prompt learning approach that effectively extracts structured information from Crohn's disease radiology reports in Hebrew, outperforming traditional methods especially for rare findings.
Contribution
Introduction of SMP-BERT, a novel prompt learning method tailored for low-resource languages and imbalanced medical datasets, improving extraction accuracy in radiology reports.
Findings
SMP-BERT achieved an AUC of 0.99, outperforming traditional methods.
SMP-BERT significantly improved F1 scores for rare conditions.
The method supports AI-assisted diagnostics on medical text written in low-resource languages.
Abstract
Automatic conversion of free-text radiology reports into structured data using Natural Language Processing (NLP) techniques is crucial for analyzing diseases on a large scale. While effective for tasks in widely spoken languages like English, generative large language models (LLMs) typically underperform with less common languages and can pose potential risks to patient privacy. Fine-tuning local NLP models is hindered by the skewed nature of real-world medical datasets, where rare findings represent a significant data imbalance. We introduce SMP-BERT, a novel prompt learning method that leverages the structured nature of reports to overcome these challenges. In our studies involving a substantial collection of Crohn's disease radiology reports in Hebrew (over 8,000 patients and 10,000 reports), SMP-BERT greatly surpassed traditional fine-tuning methods in performance, notably in…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
