Generating Querying Code from Text for Multi-Modal Electronic Health Record
Mengliang Zhang

TL;DR
This paper introduces TQGen, a dataset, and TQGen-EHRQuery, a framework, for translating natural language questions into database queries over electronic health records, addressing the challenges of medical terminology and complex data structures.
Contribution
It presents a new dataset, TQGen, and a novel framework, TQGen-EHRQuery, incorporating a medical knowledge module and a toolset-based text processing approach.
Findings
The dataset effectively captures complex EHR querying scenarios.
The framework improves query accuracy and processing efficiency.
Experimental results validate the approach's potential in EHR systems.
Abstract
Electronic health records (EHR) contain extensive structured and unstructured data, including tabular information and free-text clinical notes. Querying relevant patient information often requires complex database operations, increasing the workload for clinicians. However, complex table relationships and professional terminology in EHRs limit query accuracy. In this work, we construct a publicly available dataset, TQGen, that integrates both Tables and clinical Text for natural language-to-query Generation. To address the challenges posed by complex medical terminology and diverse question types in EHRs, we propose TQGen-EHRQuery, a framework comprising a medical knowledge module and a question template matching module. For processing medical text, we introduce the concept of a toolset, which encapsulates the text processing module as a callable…
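To make the three ingredients named in the abstract more concrete (medical knowledge, question template matching, and a callable text-processing tool), here is a minimal Python sketch. It is not the paper's implementation: the schema, the synonym table, the regex templates, and every function name below are hypothetical, and exposing the text tool as a SQLite user-defined function is only one assumed way such a tool could be made callable.

```python
import re
import sqlite3

# Hypothetical synonym table standing in for a medical knowledge module.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}


def normalize_term(term):
    """Map a lay term to its canonical name (illustrative lookup only)."""
    cleaned = term.lower().strip(" ?")
    return SYNONYMS.get(cleaned, cleaned)


def note_mentions(body, term):
    """Toy text-processing tool: 1 if the note body contains the term."""
    return int(term.lower() in (body or "").lower())


# Hypothetical question templates, each paired with a parameterized query.
TEMPLATES = [
    (re.compile(r"how many patients (?:have|had) a diagnosis of (.+)", re.I),
     "SELECT COUNT(DISTINCT patient_id) FROM diagnoses WHERE name = ?"),
    (re.compile(r"how many patients have notes mentioning (.+)", re.I),
     "SELECT COUNT(DISTINCT patient_id) FROM notes WHERE note_mentions(body, ?)"),
]


def question_to_query(question):
    """Return (sql, params) for the first template that matches the question."""
    for pattern, sql in TEMPLATES:
        match = pattern.fullmatch(question.strip())
        if match:
            return sql, (normalize_term(match.group(1)),)
    raise ValueError("no template matches: " + question)


def build_demo_db():
    """Create an in-memory database with a made-up two-table schema."""
    conn = sqlite3.connect(":memory:")
    # Register the text tool so that queries can call it by name.
    conn.create_function("note_mentions", 2, note_mentions)
    conn.executescript("""
        CREATE TABLE diagnoses (patient_id INTEGER, name TEXT);
        CREATE TABLE notes (patient_id INTEGER, body TEXT);
        INSERT INTO diagnoses VALUES
            (1, 'myocardial infarction'), (2, 'hypertension'),
            (3, 'myocardial infarction');
        INSERT INTO notes VALUES
            (1, 'Patient reports chest pain.'),
            (2, 'History of hypertension, stable.');
    """)
    return conn


def answer(conn, question):
    """Translate the question into a query, run it, and return the count."""
    sql, params = question_to_query(question)
    return conn.execute(sql, params).fetchone()[0]


if __name__ == "__main__":
    db = build_demo_db()
    print(answer(db, "How many patients had a diagnosis of heart attack?"))
    print(answer(db, "How many patients have notes mentioning high blood pressure?"))
```

Keeping the matched term as a bound parameter rather than formatting it into the SQL string avoids injection through the question text, and registering the text tool as a named function lets one query mix table filters with free-text checks.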
Taxonomy
Topics: Electronic Health Records Systems · Biomedical Text Mining and Ontologies · Topic Modeling
