Initial Investigation of LLM-Assisted Development of Rule-Based Clinical NLP System
Jianlin Shi, Brian T. Bucher

TL;DR
This paper explores using large language models (LLMs) to assist in developing rule-based clinical NLP systems. Initial experiments report high recall in snippet identification (Deepseek: 0.98, Qwen: 0.99) and a score of 1.0 in key term extraction, suggesting a path toward faster, more cost-effective, and more transparent NLP development.
Contribution
The paper proposes using LLMs only during the development phase of a rule-based NLP system, focusing on the first two steps of the pipeline: retrieving relevant snippets from clinical notes and extracting informative keywords from those snippets.
Findings
High recall in snippet identification (Deepseek: 0.98, Qwen: 0.99)
Perfect extraction of key terms (1.0)
Potential for semi-automated, faster, and transparent NLP system development
Abstract
Despite advances in machine learning (ML) and large language models (LLMs), rule-based natural language processing (NLP) systems remain active in clinical settings due to their interpretability and operational efficiency. However, their manual development and maintenance are labor-intensive, particularly in tasks with large linguistic variability. To overcome these limitations, we propose a novel approach that employs LLMs solely during the development phase of rule-based systems. We conducted initial experiments focusing on the first two steps of developing a rule-based NLP pipeline: (1) finding relevant snippets in the clinical notes and (2) extracting informative keywords from the snippets for the rule-based named entity recognition (NER) component. Our experiments demonstrated exceptional recall in identifying clinically relevant text snippets (Deepseek: 0.98, Qwen: 0.99) and 1.0 in extracting key…
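To make the idea concrete, here is a minimal Python sketch of how keywords suggested by an LLM might be compiled into a rule-based matcher, and how recall could be computed against annotated items. It is an illustration under stated assumptions, not the authors' implementation: the keyword list, the sample note, and the helper names (`build_matcher`, `find_entities`, `recall`) are all hypothetical, and no LLM is called.

```python
import re


def build_matcher(keywords):
    """Compile keywords into one case-insensitive, whole-word regex."""
    # Longest terms first, so multi-word terms win over shorter overlaps.
    ordered = sorted(set(keywords), key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def find_entities(matcher, text):
    """Return (start, end, matched_text) for every keyword hit in text."""
    return [(m.start(), m.end(), m.group(0)) for m in matcher.finditer(text)]


def recall(predicted, gold):
    """Fraction of gold items that also appear among the predictions."""
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("recall is undefined when there are no gold items")
    return len(gold_set & set(predicted)) / len(gold_set)


# Hypothetical keywords standing in for terms an LLM might suggest;
# they are not taken from the paper.
keywords = ["wound infection", "purulent drainage", "erythema"]
matcher = build_matcher(keywords)

# Hypothetical note text, invented for illustration only.
note = "Incision shows erythema and purulent drainage. No fever."
hits = find_entities(matcher, note)
```

Because the matcher is a plain regular expression, every match can be traced back to a specific keyword, which is the kind of transparency the abstract attributes to rule-based systems.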
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
