BioMamba: Domain-Adaptive Biomedical Language Models
Ling Yue, Mingzhi Zhu, Sixue Xing, Shaowu Pan, Vijil Chenthamarakshan, Yanbo Wang, Yunning Cao, Payel Das, Tianfan Fu

TL;DR
BioMamba is a family of biomedical language models that improves performance on biomedical and clinical tasks while maintaining general language ability through balanced domain-adaptive pretraining.
Contribution
We introduce BioMamba, a family of biomedical language models built by continued pretraining of public Mamba2 checkpoints on PubMed mixed with small amounts of C4 and Wikipedia, so that domain-specific adaptation is balanced against preserving general language ability.
Findings
BioMamba improves PubMed and Wikipedia modeling without degrading C4 performance.
BioMamba transfers effectively to biomedical and clinical tasks with strong downstream results.
The best model achieves PubMed perplexity of 5.28 and high accuracy on BioASQ and PubMedQA.
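For readers unfamiliar with the metric, perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better. The sketch below is a generic illustration of that definition, not the authors' evaluation code; the function name `perplexity` and the input values are hypothetical, and the 5.28 figure is used only to show what mean loss (in nats) such a perplexity corresponds to.

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token), NLL in nats."""
    if not token_nlls:
        raise ValueError("need at least one token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative only: a perplexity of 5.28 corresponds to a mean NLL of ln(5.28) nats.
mean_nll = math.log(5.28)
ppl = perplexity([mean_nll] * 4)  # four hypothetical tokens with identical loss
print(f"mean NLL = {mean_nll:.3f} nats, perplexity = {ppl:.2f}")
```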
Abstract
Background: Biomedical language models should improve performance on biomedical text while retaining general-domain language ability. For Mamba-based models, this trade-off has not been clearly studied across biomedical literature and clinical text. Methods: We developed BioMamba, a family of biomedical models obtained by continued pretraining of public Mamba2 checkpoints on PubMed, with small amounts of general-domain data from the Colossal Clean Crawled Corpus (C4) and Wikipedia included to help preserve general-domain language ability. We evaluated language modeling and three downstream tasks across multiple model scales: clinical note completion, discharge summary generation, and biomedical yes/no question answering. Results: BioMamba consistently improved PubMed modeling, improved Wikipedia modeling, and left C4 performance largely unchanged. After supervised fine-tuning, BioMamba…
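The data mixing described in the Methods can be sketched as weighted sampling over corpora. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract says only that "small amounts" of C4 and Wikipedia were included, so the weights in `MIXTURE` are hypothetical placeholders, as are the helper names `sample_sources` and `mixed_stream`.

```python
import itertools
import random

# Hypothetical weights: the source gives no ratios, only "small amounts"
# of general-domain data alongside PubMed.
MIXTURE = {"pubmed": 0.90, "c4": 0.05, "wikipedia": 0.05}

def sample_sources(mixture, n, seed=0):
    """Draw n corpus names with probability proportional to their weight."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)

def mixed_stream(corpora, mixture, n, seed=0):
    """Yield n (source, document) pairs following the mixture weights.

    Each corpus is cycled, so a small corpus is repeated rather than exhausted.
    """
    iterators = {name: itertools.cycle(docs) for name, docs in corpora.items()}
    for name in sample_sources(mixture, n, seed):
        yield name, next(iterators[name])

# Toy corpora standing in for real datasets.
toy = {"pubmed": ["p1", "p2"], "c4": ["c1"], "wikipedia": ["w1"]}
for source, doc in mixed_stream(toy, MIXTURE, 5):
    print(source, doc)
```

Sampling per document keeps the general-domain share roughly constant throughout training, which is one common way to limit forgetting; the paper may have mixed its data differently.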
Taxonomy
Topics: Topic Modeling · Genetics, Bioinformatics, and Biomedical Research
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
