Does Biomedical Training Lead to Better Medical Performance?
Amin Dada, Marie Bauer, Amanda Butler Contreras, Osman Alperen Koraş, Constantin Marc Seibold, Kaleb E Smith, Jens Kleesiek

TL;DR
This study systematically evaluates biomedical LLMs on medical tasks and finds that biomedical fine-tuning often reduces performance and that general-domain models can outperform domain-specific ones, pointing to a trade-off in domain-specific training.
Contribution
It provides the first comprehensive evaluation of biomedical training effects on medical task performance, revealing potential drawbacks of domain-specific fine-tuning.
Findings
Biomedical fine-tuning often decreases model performance on medical tasks.
General-domain models can outperform biomedical models in medical tasks.
Open-source datasets and scripts facilitate further research.
Abstract
Large Language Models (LLMs) are expected to significantly contribute to patient care, diagnostics, and administrative processes. Emerging biomedical LLMs aim to address healthcare-specific challenges, including privacy demands and computational constraints. Assessing the models' suitability for this sensitive application area is of the utmost importance. However, biomedical training has not been systematically evaluated on medical tasks. This study investigates the effect of biomedical training on model performance across six practical medical tasks. In contrast to previous evaluations, our results reveal a performance decline in nine out of twelve biomedical models after fine-tuning, particularly on tasks involving hallucinations, ICD10 coding, and instruction adherence. General-domain models like Meta-Llama-3.1-70B-Instruct outperformed their biomedical counterparts,…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: ALIGN
