MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
Roomani Srivastava, Suraj Prasad, Lipika Bhat, Sarvesh Deshpande, Barnali Das, Kshitij Jadhav

TL;DR
MedPromptExtract is an automated tool that combines NLP, prompt engineering, and LLMs to efficiently anonymize and extract high-fidelity clinical data from discharge summaries, aiding medical record digitization especially in resource-limited settings.
Contribution
The paper introduces MedPromptExtract, a novel integrated system that automates anonymization and data extraction from medical records using advanced NLP and prompt engineering techniques.
Findings
The anonymization pipeline takes 3 seconds per summary, and its output was successfully verified by clinicians.
NLP extraction of regular fields achieves 100% accuracy at 0.2 seconds per summary.
AKI-related features are extracted with high fidelity, with AUCs above 0.9.
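The AUC figures above score extracted features against a reference. As a minimal sketch of how such a score can be computed, assuming each feature is a binary clinician label paired with a numeric model score per summary (the function name `auc_score` and the toy data are illustrative and not taken from the paper):

```python
def auc_score(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case.
    Ties contribute 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: clinician annotations (1 = feature present) and
# model scores for six discharge summaries.
clinician_labels = [1, 0, 1, 1, 0, 0]
model_scores = [0.9, 0.2, 0.8, 0.4, 0.5, 0.1]
print(auc_score(clinician_labels, model_scores))
```

The pairwise form is quadratic in the number of summaries, which is acceptable for small validation sets; a sort-based variant would be preferable for large ones.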
Abstract
Introduction: The labour-intensive nature of data extraction from sources like discharge summaries (DS) poses significant obstacles to the digitisation of medical records, particularly in low- and middle-income countries (LMICs). In this paper we present MedPromptExtract, a completely automated method to efficiently extract data from DS while maintaining confidentiality. Methods: The source of data was DS from Kokilaben Dhirubhai Ambani Hospital (KDAH) of patients having Acute Kidney Injury (AKI). A pre-existing tool, EIGEN, which leverages semi-supervised learning techniques for high-fidelity information extraction, was used to anonymize the DS. Natural Language Processing (NLP) was used to extract data from regular fields. We used Prompt Engineering and a Large Language Model (LLM) to extract custom clinical information from free-flowing text describing the patients…
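The abstract does not say how the NLP module parses regular fields. As a rough sketch of one way such extraction could work, assuming each regular field appears on its own line in a `Label: value` layout (the field names in `FIELDS`, the function `extract_fields`, and the sample text are all hypothetical and not from the paper):

```python
import re

# Hypothetical field labels; a real discharge summary template may differ.
FIELDS = ["Age", "Sex", "Date of Admission", "Date of Discharge"]


def extract_fields(text, fields=FIELDS):
    """Return {label: value} for each label found on a 'Label: value' line.
    Labels that do not appear map to None."""
    result = {}
    for name in fields:
        pattern = rf"^[ \t]*{re.escape(name)}[ \t]*:[ \t]*(.+?)[ \t]*$"
        match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
        result[name] = match.group(1) if match else None
    return result


# Made-up sample; contains no real patient data.
sample = "Age: 54 \nSex: M\nDate of Admission: 2021-03-04\n"
print(extract_fields(sample))
```

Returning `None` for missing labels keeps absent fields distinguishable from empty strings, which matters when the output feeds a structured record.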
Taxonomy
Topics: Artificial Intelligence in Healthcare
