Initial Investigation of LLM-Assisted Development of Rule-Based Clinical NLP System

Jianlin Shi; Brian T. Bucher

arXiv:2506.16628·cs.CL·June 23, 2025

Open Access

TL;DR

This paper explores using large language models to assist in developing rule-based clinical NLP systems. It reports high recall in snippet identification and perfect key-term extraction, which could enable faster, more cost-effective, and more transparent NLP development.

Contribution

The paper introduces an approach that uses LLMs only during the development of a rule-based NLP system, focusing on snippet retrieval and keyword extraction from clinical notes.

Findings

01 High recall in snippet identification (Deepseek: 0.98, Qwen: 0.99)

02 Perfect extraction of key terms (1.0)

03 Potential for semi-automated, faster, and transparent NLP system development

Abstract

Despite advances in machine learning (ML) and large language models (LLMs), rule-based natural language processing (NLP) systems remain in active use in clinical settings because of their interpretability and operational efficiency. However, developing and maintaining them manually is labor-intensive, particularly for tasks with large linguistic variability. To overcome these limitations, we propose a novel approach that employs LLMs solely during the development phase of the rule-based system. We conducted initial experiments on the first two steps of developing a rule-based NLP pipeline: (1) finding relevant snippets in the clinical note and (2) extracting informative keywords from those snippets for the rule-based named entity recognition (NER) component. Our experiments demonstrated exceptional recall in identifying clinically relevant text snippets (Deepseek: 0.98, Qwen: 0.99) and 1.0 in extracting key…
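To make the two steps and the recall metric concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`build_matcher`, `find_mentions`, `recall`), the sample keywords, and the sample note are all hypothetical, and the keyword list stands in for whatever an LLM would return during development. The sketch only illustrates how extracted keywords could be compiled into a transparent rule for a rule-based NER component, and how recall over snippets is computed.

```python
import re


def build_matcher(keywords):
    """Compile keywords into one case-insensitive, whole-word regex (hypothetical helper)."""
    # Sort longest first so multi-word terms are tried before shorter ones.
    ordered = sorted(set(keywords), key=len, reverse=True)
    pattern = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + pattern + r")\b", re.IGNORECASE)


def find_mentions(matcher, text):
    """Return (start, end, matched_text) for every keyword hit in the text."""
    return [(m.start(), m.end(), m.group(0)) for m in matcher.finditer(text)]


def recall(predicted, gold):
    """Fraction of gold items that also appear in the predictions."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(predicted)) / len(gold)


# Assumption: these keywords are illustrative placeholders for LLM output.
keywords = ["wound infection", "purulent drainage", "erythema"]
matcher = build_matcher(keywords)

# Assumption: a made-up sentence, not data from the paper.
note = "Incision shows erythema and purulent drainage. No fever."
mentions = find_mentions(matcher, note)

# Snippet-level recall on made-up snippet identifiers.
gold_snippets = ["s1", "s2", "s3", "s4"]
predicted_snippets = ["s1", "s2", "s4", "s9"]
snippet_recall = recall(predicted_snippets, gold_snippets)
```

Because the LLM is only consulted when the keyword list is built, the deployed matcher stays a plain regular expression that can be inspected and edited by hand, which is the interpretability argument the abstract makes for rule-based systems.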

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare