Biomedical Nested NER with Large Language Model and UMLS Heuristics

Wenxin Zhou

arXiv:2407.05480 · cs.CL · July 9, 2024 · 1 citation

TL;DR

This paper introduces a biomedical nested named entity recognition system that combines a large language model with UMLS-based heuristics, achieving F1 scores of 0.39 on the BioNNE validation set and 0.348 on the test set.

Contribution

It identifies entities with a large language model (Mixtral 8x7B Instruct) and a ScispaCy NER model, then categorizes them with custom heuristics based on UMLS semantic types.

Findings

01. F1 score of 0.39 on the BioNNE validation set

02. F1 score of 0.348 on the BioNNE test set

03. Discusses the system's limitations and proposes future improvements

Abstract

In this paper, we present our system for the BioNNE English track, which aims to extract 8 types of biomedical nested named entities from biomedical text. We use a large language model (Mixtral 8x7B Instruct) and a ScispaCy NER model to identify entities in an article, and we build custom heuristics based on Unified Medical Language System (UMLS) semantic types to categorize the entities. We discuss the results and limitations of our system and propose future improvements. Our system achieved an F1 score of 0.39 on the BioNNE validation set and 0.348 on the test set.
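The categorization step described in the abstract can be illustrated with a minimal sketch. It assumes candidate spans have already been identified and linked to UMLS semantic type IDs (TUIs) by an upstream component. The TUIs below are real UMLS identifiers, but the label names, the `TUI_TO_LABEL` table, the `Candidate` type, and the `categorize` function are all hypothetical and are not taken from the paper; the authors' actual heuristics may differ substantially.

```python
from dataclasses import dataclass

# Hypothetical mapping from UMLS semantic type IDs (TUIs) to entity labels.
# The TUIs are real UMLS identifiers; the label names and the mapping itself
# are illustrative assumptions, not the paper's actual heuristics.
TUI_TO_LABEL = {
    "T047": "DISO",              # Disease or Syndrome
    "T033": "FINDING",           # Finding
    "T037": "INJURY_POISONING",  # Injury or Poisoning
    "T039": "PHYS",              # Physiologic Function
    "T074": "DEVICE",            # Medical Device
    "T059": "LABPROC",           # Laboratory Procedure
    "T023": "ANATOMY",           # Body Part, Organ, or Organ Component
    "T121": "CHEM",              # Pharmacologic Substance
}


@dataclass(frozen=True)
class Candidate:
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends (exclusive)
    text: str
    tuis: tuple  # semantic types from an entity linker (assumed input)


def categorize(candidates):
    """Label each candidate span with the label of its first mapped TUI.

    Spans with no mapped semantic type are dropped. Overlapping and nested
    spans are all kept, since the task is nested NER.
    """
    labeled = []
    for c in candidates:
        label = next((TUI_TO_LABEL[t] for t in c.tuis if t in TUI_TO_LABEL), None)
        if label is not None:
            labeled.append((c.start, c.end, c.text, label))
    # Order by start offset, with outer spans before the spans they contain.
    return sorted(labeled, key=lambda s: (s[0], -s[1]))


candidates = [
    Candidate(0, 17, "lung cancer cells", ("T025",)),   # Cell: unmapped here
    Candidate(0, 11, "lung cancer", ("T191", "T047")),  # second TUI is mapped
    Candidate(0, 4, "lung", ("T023",)),
]
result = categorize(candidates)
for span in result:
    print(span)
```

In this sketch the inner span "lung" survives alongside the outer span "lung cancer", which is the behavior a nested NER system needs; a flat NER system would typically keep only one of the two.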

Taxonomy

Topics: Topic Modeling

Methods: Sparse Evolutionary Training