Cross-Institutional Dental EHR Entity Extraction via Generative AI and Synthetic Notes
Yao-Shun Chuang, Chun-Teh Lee, Oluwabunmi Tokede, Guo-Hao Lin, Ryan Brandon, Trung Duong Tran, Xiaoqian Jiang, Muhammad F. Walji

TL;DR
This study uses GPT-4-generated synthetic notes and NLP models to improve the extraction of diagnostic information from unstructured dental records, significantly enhancing accuracy in periodontal diagnosis classification.
Contribution
It introduces a novel approach combining synthetic data from LLMs with NLP models to improve dental diagnostic information extraction from clinical notes.
Findings
Achieved 0.99 accuracy in periodontal status extraction at Site 1
Achieved 0.98 accuracy at Site 2
Outperformed existing methods in diagnostic extraction accuracy
Abstract
This research addresses the issue of missing structured data in dental records by extracting diagnostic information from unstructured text. The complexity of the updated periodontology classification system has increased the number of incomplete or missing structured diagnoses. To tackle this, we use advanced AI and NLP methods, leveraging GPT-4 to generate synthetic notes for fine-tuning a RoBERTa model. This significantly enhances the model's ability to understand medical and dental language. We evaluated the model using 120 randomly selected clinical notes from two datasets, demonstrating its improved diagnostic extraction accuracy. The results showed high accuracy in extracting periodontal status, stage, and grade, with Site 1 scoring 0.99 and Site 2 scoring 0.98. In the subtype category, Site 2 achieved perfect scores, outperforming Site 1. This method enhances extraction accuracy and broadens its…
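The abstract reports accuracy separately for status, stage, grade, and subtype, but does not spell out the scoring procedure. As a rough illustration only, per-category accuracy over a set of annotated notes could be computed as in the sketch below. The function name `per_category_accuracy`, the label dictionaries, and the toy labels are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def per_category_accuracy(gold, predicted):
    """Return {category: accuracy} for two aligned lists of label dicts.

    Each dict maps a category name (e.g. "status") to a label for one note.
    A prediction counts as correct only if it equals the gold label exactly.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold_note, pred_note in zip(gold, predicted):
        for category, gold_label in gold_note.items():
            total[category] += 1
            if pred_note.get(category) == gold_label:
                correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Hypothetical toy annotations for two notes (not data from the study).
gold = [
    {"status": "periodontitis", "stage": "III", "grade": "B"},
    {"status": "gingivitis", "stage": None, "grade": None},
]
predicted = [
    {"status": "periodontitis", "stage": "III", "grade": "C"},
    {"status": "gingivitis", "stage": None, "grade": None},
]

scores = per_category_accuracy(gold, predicted)
```

Scoring each category on its own, as assumed here, keeps an error in one field (such as grade) from hiding correct extractions in the others, which matches the way the abstract reports results per category and per site.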
Taxonomy
Topics: Dental Radiography and Imaging
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections · Dropout · Linear Layer · Attention Dropout
