Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device

Zongzhe Xu

arXiv:2212.01217·cs.CL·December 5, 2022

TL;DR

This study explores using pre-trained NLP embedding models to help the FDA match medical device descriptions to device types, narrowing 2585 candidate types down to a shortlist of the 15 most likely.

Contribution

The paper evaluates how well pre-trained embeddings, from FastText word vectors to sentence transformers and GPT-3, classify device descriptions and detect devices wrongly classified in the FDA database.

Findings

01

Sentence transformers with T5 and MPNet, and GPT-3 semantic search embeddings, place the correct label within the 15 most likely results out of 2585 device types.

02

Models effectively identify incorrectly labeled devices.

03

Difficulty remains in detecting closely related false classifications.

Abstract

This paper proposes a natural language processing method that might assist the FDA medical device marketing process. Actual device descriptions are matched against the device descriptions in Title 21 of the CFR to determine the corresponding device type. Both pre-trained word embeddings such as FastText and large pre-trained sentence embedding models such as sentence transformers are evaluated on how accurately they characterize a device description. A further experiment tests whether these models can identify devices wrongly classified in the FDA database. The results show that sentence transformers with T5 and MPNet, and GPT-3 semantic search embeddings, identify the correct classification with high accuracy: the correct label falls within the 15 most likely results out of 2585 device types…
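
The matching step described in the abstract can be sketched as a ranking problem: embed the submitted description, embed each reference description, and keep the k most similar device types. The sketch below is only an illustration under stated assumptions. A toy bag-of-words vector stands in for the FastText, sentence transformer, and GPT-3 embeddings the paper evaluates; cosine similarity is assumed as the ranking score; and the reference descriptions and the helper names (`embed`, `top_k_labels`, `is_suspect`) are hypothetical, not text from Title 21 of the CFR or code from the paper.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_labels(description, reference, k):
    """Return the k device types whose reference text is most similar."""
    query = embed(description)
    ranked = sorted(
        reference,
        key=lambda label: cosine(query, embed(reference[label])),
        reverse=True,
    )
    return ranked[:k]


def is_suspect(description, recorded_label, reference, k):
    """Flag a record whose recorded device type is missing from the top k."""
    return recorded_label not in top_k_labels(description, reference, k)


# Hypothetical reference descriptions (illustrative only, not CFR text).
REFERENCE = {
    "stethoscope": "A device used to listen to heart and lung sounds of a patient.",
    "clinical thermometer": "A device used to measure the body temperature of a patient.",
    "surgical scalpel": "A bladed instrument used to cut tissue during surgery.",
}

submission = "Electronic device that amplifies heart and lung sounds for listening."
shortlist = top_k_labels(submission, REFERENCE, k=2)
print(shortlist)
print(is_suspect(submission, "surgical scalpel", REFERENCE, k=2))
```

With a real embedding model, only `embed` would change; the ranking and the top-k check stay the same. The same shortlist also supports the mislabel experiment: a database record whose recorded type does not appear among the k nearest device types becomes a candidate for manual review.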

Taxonomy

Topics: Intellectual Property and Patents

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Inverse Square Root Schedule · Adafactor · Gated Linear Unit · Attention Dropout · Linear Warmup With Cosine Annealing