BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Elliot Bolton; Abhinav Venigalla; Michihiro Yasunaga; David Hall; Betty Xiong; Tony Lee; Roxana Daneshjou; Jonathan Frankle; Percy Liang; Michael Carbin; Christopher D. Manning

arXiv:2403.18421·cs.CL·March 28, 2024·33 cites

Open Access · 1 Repo · 2 Models

TL;DR

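BioMedLM is a compact 2.7B-parameter GPT-style language model trained exclusively on PubMed abstracts and full articles. When fine-tuned, it is competitive with much larger models on biomedical multiple-choice question answering, and it offers a more private, efficient alternative because it is small enough that users need not send their input data over the internet.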

Contribution

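The paper builds and releases BioMedLM, a 2.7B-parameter autoregressive model trained exclusively on PubMed abstracts and full articles, and shows that, once fine-tuned, it is competitive with much larger models on multiple-choice biomedical question answering and can produce useful answers to patient questions.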

Findings

01  Achieves 57.3% on MedMCQA (dev) when fine-tuned

02  Scores 69.0% on the MMLU Medical Genetics exam when fine-tuned

03  Can be fine-tuned to produce useful answers to patient questions on medical topics

Abstract

Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks. However, these models have hundreds of billions of parameters, are computationally expensive to run, require users to send their input data over the internet, and are trained on unknown data sources. Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles. When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with much larger models, such as achieving a score of 57.3% on MedMCQA (dev) and 69.0% on the MMLU Medical Genetics exam. BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics. This demonstrates that smaller…
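The multiple-choice figures quoted above (57.3% and 69.0%) are accuracies: the fraction of questions for which the option the model picks matches the answer key. The sketch below illustrates that evaluation loop under stated assumptions. The abstract does not say how BioMedLM scores answer options, so `score_fn`, `toy_score`, `TOY_SCORES`, and `EXAMPLES` are hypothetical stand-ins for a fine-tuned model and a real dataset, not the authors' code or data.

```python
from typing import Callable, Sequence, Tuple

# (question, answer options, index of the correct option)
Example = Tuple[str, Sequence[str], int]
# Hypothetical interface: higher score means the model prefers the option.
ScoreFn = Callable[[str, str], float]


def pick_option(question: str, options: Sequence[str], score_fn: ScoreFn) -> int:
    """Return the index of the highest-scoring option."""
    scores = [score_fn(question, option) for option in options]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(examples: Sequence[Example], score_fn: ScoreFn) -> float:
    """Fraction of examples where the picked option equals the answer key."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        1
        for question, options, gold in examples
        if pick_option(question, options, score_fn) == gold
    )
    return correct / len(examples)


# Made-up scores standing in for model outputs (e.g. log-probabilities).
TOY_SCORES = {
    ("Q1", "A"): -3.2, ("Q1", "B"): -0.4, ("Q1", "C"): -2.9, ("Q1", "D"): -4.1,
    ("Q2", "A"): -1.0, ("Q2", "B"): -2.5, ("Q2", "C"): -0.7, ("Q2", "D"): -3.3,
}


def toy_score(question: str, option: str) -> float:
    """Look up a made-up score; a real scorer would query the model."""
    return TOY_SCORES[(question, option)]


EXAMPLES = [
    ("Q1", ["A", "B", "C", "D"], 1),  # toy scorer picks B: correct
    ("Q2", ["A", "B", "C", "D"], 0),  # toy scorer picks C: incorrect
]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(EXAMPLES, toy_score):.1%}")
```

Keeping the scorer behind a plain function interface means the same loop works whether the underlying model ranks options by likelihood or uses a classification head. Only `score_fn` would change.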

Code & Models

Repositories

stanford-crfm/biomedlm (PyTorch, official)

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Attention Is All You Need · Layer Normalization · Byte Pair Encoding · Softmax · Dropout · Multi-Head Attention