Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device
Zongzhe Xu

TL;DR
This study explores using pre-trained NLP embedding models to help the FDA classify medical devices, with the aim of reducing the manual effort of device categorization.
Contribution
It evaluates the effectiveness of large pre-trained language models like sentence transformers and GPT-3 in automating device classification and error detection for FDA submissions.
Findings
Sentence transformers with T5 and MPNet, and GPT-3 semantic search embeddings, place the correct device type within the 15 most likely results out of 2585 device types.
Models effectively identify incorrectly labeled devices.
Wrong labels that are closely related to the correct device type remain difficult to detect.
Abstract
This paper proposes a natural language processing method that might assist the FDA medical device marketing process. Actual device descriptions are matched against the device descriptions in FDA Title 21 of the CFR to determine the corresponding device type. Both pre-trained word embeddings such as FastText and large pre-trained sentence embedding models such as sentence transformers are evaluated on how accurately they characterize a device description. An experiment also tests whether these models can identify devices that are wrongly classified in the FDA database. The results show that sentence transformers with T5 and MPNet, and GPT-3 semantic search embeddings, identify the correct classification with high accuracy, narrowing the correct label down to the 15 most likely results out of 2585 types of device…
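The matching step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `embed` function below is a toy bag-of-words stand-in for a real embedding model (FastText, a sentence transformer, or GPT-3 embeddings), the reference descriptions are invented rather than taken from the CFR, and the names `top_k_labels` and `possibly_mislabeled` are hypothetical. It only shows the general shape of ranking device types by cosine similarity, keeping the top k candidates, and flagging a record whose recorded label falls outside them.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_labels(description, reference, k=15):
    """Rank reference device types by similarity and return the best k labels."""
    query = embed(description)
    ranked = sorted(
        reference,
        key=lambda label: cosine(query, embed(reference[label])),
        reverse=True,
    )
    return ranked[:k]


def possibly_mislabeled(description, recorded_label, reference, k=15):
    """Flag a record whose recorded label is absent from the top-k candidates."""
    return recorded_label not in top_k_labels(description, reference, k)


# Hypothetical reference descriptions, not actual CFR text.
reference = {
    "clinical thermometer": "device used to measure body temperature of a patient",
    "stethoscope": "device used to listen to heart and lung sounds",
    "surgical scalpel": "blade used to cut tissue during surgery",
}

submission = "handheld electronic device to measure oral body temperature"
candidates = top_k_labels(submission, reference, k=2)
flagged = possibly_mislabeled(submission, "surgical scalpel", reference, k=2)
```

With only three toy device types, k is set to 2 here; the abstract reports results for k = 15 over 2585 types. A wrong label that is closely related to the correct one would tend to stay inside the top k, so a check of this kind would not flag it, which is consistent with the difficulty noted in the findings.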
Taxonomy
Topics: Intellectual Property and Patents
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Inverse Square Root Schedule · Adafactor · Gated Linear Unit · Attention Dropout · Linear Warmup With Cosine Annealing
