Large Language Models for Biomedical Article Classification

Jakub Proboszcz; Paweł Cichosz

arXiv:2603.11780 · cs.CL · March 13, 2026

Open Access

TL;DR

This paper systematically evaluates large language models as classifiers of biomedical articles, comparing model configurations and prompting methods, and finds that the best configurations come close to conventional classifiers on challenging datasets.

Contribution

It provides a comprehensive analysis of LLMs as classifiers in biomedical domains, including prompt types, output processing, and few-shot strategies, with practical recommendations.

Findings

01  Average PR AUC above 0.4 for zero-shot prompting

02  Nearly 0.5 PR AUC for few-shot prompting

03  Performance close to traditional classifiers like Naive Bayes and Random Forest
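
For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of one common way to estimate PR AUC: average precision, the step-wise area under the precision-recall curve. It is an illustration only; the paper's exact PR AUC computation is not stated in this summary, and the function name `average_precision` and the toy labels and scores are assumptions made for the example.

```python
def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    y_true: list of 0/1 labels (1 = positive class).
    scores: list of predicted scores, higher = more likely positive.
    Tied scores keep their input order (Python's sort is stable).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")

    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where recall increases.
            total += hits / rank
    return total / n_pos


# Toy data (hypothetical, not from the paper).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
print(round(average_precision(labels, scores), 3))  # 0.833
```

Unlike accuracy, this score depends on how positives are ranked, which is why it requires class probability (or score) predictions rather than class labels alone.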

Abstract

This work presents a systematic and in-depth investigation of the utility of large language models as text classifiers for biomedical article classification. The study uses several small and mid-size open-source models, as well as selected closed-source ones, and is more comprehensive than most prior work in the scope of evaluated configurations: different types of prompts, output processing methods for generating both class and class probability predictions, and few-shot example counts and selection methods. The performance of the most successful configurations is compared to that of conventional classification algorithms. The average PR AUC obtained over 15 challenging datasets, above 0.4 for zero-shot prompting and nearly 0.5 for few-shot prompting, comes close to that of the naïve Bayes classifier (0.5), the random forest algorithm (0.5 with default settings or…
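
The abstract mentions output processing that yields class probability predictions but, in this excerpt, does not say how. As a hedged illustration, one commonly used approach is to read the model's log-probabilities for two candidate answer tokens and renormalize them. The sketch below assumes a binary task answered with "yes" or "no"; the function name and the input values are hypothetical and are not taken from the paper.

```python
import math


def positive_class_probability(logprob_yes, logprob_no):
    """Renormalize two answer-token log-probabilities into P(positive).

    Assumption: the model was asked a yes/no question and exposed a
    log-probability for each answer token. Probability mass assigned
    to any other token is ignored.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logprob_yes, logprob_no)
    e_yes = math.exp(logprob_yes - m)
    e_no = math.exp(logprob_no - m)
    return e_yes / (e_yes + e_no)


# Hypothetical values: the model puts 0.6 on "yes" and 0.2 on "no".
p = positive_class_probability(math.log(0.6), math.log(0.2))
print(round(p, 2))  # 0.75
```

A score of this kind can be thresholded to obtain a class prediction or used directly as the ranking score for a PR AUC estimate.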

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Text and Document Classification Technologies