Analysing Lightweight Large Language Models for Biomedical Named Entity Recognition on Diverse Output Formats
Pierre Epron (HeKA | U1346, DIG), Adrien Coulet (HeKA | U1346), Mehwish Alam (IP Paris, DIG)

TL;DR
This paper evaluates lightweight LLMs for biomedical named entity recognition, analyzing how different output formats affect their performance and identifying formats that yield better results.
Contribution
It provides an experimental analysis showing lightweight LLMs can perform competitively in biomedical NER and explores the impact of output formats on their effectiveness.
Findings
Lightweight LLMs achieve competitive biomedical NER performance.
Certain output formats are consistently associated with better performance.
Instruction tuning over multiple formats does not improve results.
Abstract
Despite their strong linguistic capabilities, Large Language Models (LLMs) are computationally demanding and require substantial resources for fine-tuning, which makes them ill-suited to the privacy and budget constraints of many healthcare settings. To address this, we present an experimental analysis of Biomedical Named Entity Recognition using lightweight LLMs and evaluate the impact of different output formats on model performance. The results reveal that lightweight LLMs can achieve performance competitive with larger models, highlighting their potential as effective alternatives for biomedical information extraction. Our analysis shows that instruction tuning over many distinct formats does not improve performance, but identifies several formats consistently associated with better performance.
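To make the notion of an "output format" concrete, here is a minimal sketch of how the same entity annotations could be serialized in two different ways as a generation target. The two formats shown (a JSON list and inline tags), the function names, the example sentence, and the entity labels are all illustrative assumptions; this summary does not state which formats the paper actually evaluates.

```python
import json

def to_json_format(text, entities):
    """Hypothetical format A: a JSON list of {"text", "type"} objects."""
    return json.dumps([{"text": text[s:e], "type": t} for s, e, t in entities])

def to_inline_format(text, entities):
    """Hypothetical format B: the sentence with each span wrapped in <type>...</type>."""
    out, cursor = [], 0
    # Process spans left to right; assumes spans do not overlap.
    for s, e, t in sorted(entities):
        out.append(text[cursor:s])
        out.append(f"<{t}>{text[s:e]}</{t}>")
        cursor = e
    out.append(text[cursor:])
    return "".join(out)

# Illustrative sentence with (start, end, type) character spans.
sentence = "Aspirin reduces fever."
spans = [(0, 7, "Chemical"), (16, 21, "Disease")]

print(to_json_format(sentence, spans))
print(to_inline_format(sentence, spans))
```

Both strings encode the same annotations; only the serialization the model is asked to produce differs, which is the variable whose effect on performance the abstract describes.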
