Use GPT-J Prompt Generation with RoBERTa for NER Models on Diagnosis Extraction of Periodontal Diagnosis from Electronic Dental Records
Yao-Shun Chuang, Xiaoqian Jiang, Chun-Teh Lee, Ryan Brandon, Duong Tran, Oluwabunmi Tokede, Muhammad F. Walji

TL;DR
This paper presents a novel approach combining GPT-J prompt generation with RoBERTa for extracting periodontal diagnoses from electronic dental records, demonstrating high accuracy and efficiency in NER tasks.
Contribution
It introduces a prompt generation method using GPT-J to enhance NER performance with RoBERTa in clinical text mining, emphasizing seed quality over quantity.
Findings
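Best F1 score of 0.72 in the direct test, obtained with a lower ratio of negative examples and more examples per prompt
Consistent F1 scores of 0.92-0.97 across all prompt settings after training the RoBERTa model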
Efficient extraction of periodontal diagnoses from clinical notes
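For readers less familiar with the metric, the F1 scores above combine precision and recall over extracted entities. The sketch below assumes exact-match scoring over (document id, start, end, label) tuples; the summary does not state which matching scheme the authors used, and the function name and example spans are hypothetical.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over entity tuples (an assumed scoring scheme)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: (doc_id, start_char, end_char, label)
gold = [(1, 0, 24, "DIAGNOSIS"), (1, 40, 61, "DIAGNOSIS"),
        (2, 5, 30, "DIAGNOSIS"), (3, 12, 33, "DIAGNOSIS")]
predicted = [(1, 0, 24, "DIAGNOSIS"), (2, 5, 30, "DIAGNOSIS"),
             (3, 10, 33, "DIAGNOSIS")]  # last span has a wrong start offset

score = entity_f1(gold, predicted)  # precision 2/3, recall 2/4
```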
Abstract
This study explored the usability of prompt generation for named entity recognition (NER) tasks and its performance under different prompt settings. Prompts generated with GPT-J models were used both to test directly against the gold standard and to generate seed data, which was then fed to a RoBERTa model through the spaCy package. In the direct test, a lower ratio of negative examples combined with a higher number of examples in the prompt achieved the best result, an F1 score of 0.72. After training with the RoBERTa model, performance was consistent across all settings, with F1 scores of 0.92-0.97. The study highlighted the importance of seed quality rather than quantity in feeding NER models. This research reports an efficient and accurate way to mine clinical notes for periodontal diagnoses, allowing researchers to build an NER model easily and quickly with the prompt generation approach.
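The two prompt settings the abstract varies, the number of examples and the share of negative examples, can be illustrated with a small few-shot prompt builder. This is a minimal sketch under stated assumptions, not the authors' template: the `Sentence:` / `Diagnosis:` layout, the `none` marker for negatives, the function name, and the sample sentences are all hypothetical.

```python
import random


def build_prompt(positives, negatives, n_examples, negative_ratio, query, seed=0):
    """Assemble a few-shot prompt with a chosen share of negative examples.

    positives: list of (sentence, diagnosis) pairs
    negatives: list of sentences that contain no diagnosis
    """
    rng = random.Random(seed)
    n_neg = round(n_examples * negative_ratio)
    n_pos = n_examples - n_neg
    shots = list(rng.sample(positives, n_pos))
    shots += [(sentence, "none") for sentence in rng.sample(negatives, n_neg)]
    rng.shuffle(shots)
    blocks = [f"Sentence: {s}\nDiagnosis: {d}" for s, d in shots]
    # Leave the final answer slot open for the language model to complete.
    blocks.append(f"Sentence: {query}\nDiagnosis:")
    return "\n\n".join(blocks)


# Hypothetical example notes, invented for illustration only.
positives = [
    ("Generalized moderate chronic periodontitis noted.",
     "generalized moderate chronic periodontitis"),
    ("Exam shows localized severe periodontitis.",
     "localized severe periodontitis"),
    ("Patient presents with Stage III Grade B periodontitis.",
     "Stage III Grade B periodontitis"),
]
negatives = [
    "Patient advised to floss daily.",
    "Next recall visit in six months.",
]

prompt = build_prompt(positives, negatives, n_examples=4, negative_ratio=0.25,
                      query="Findings consistent with gingivitis.")
```

The resulting string could be passed to a text-generation model, and the completion after the final `Diagnosis:` read back as the predicted entity. Changing `n_examples` and `negative_ratio` reproduces the kind of setting sweep the abstract describes.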
Taxonomy
Topics: Dental Radiography and Imaging · Oral microbiology and periodontitis research · Periodontal Regeneration and Treatments
Methods: Multi-Head Attention · Attention Is All You Need · Residual Connection · Layer Normalization · Linear Warmup With Linear Decay · Dense Connections · Dropout · Softmax · Linear Layer · WordPiece
