Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution
Qiao Jin, Yin Fang, Lauren He, Yifan Yang, Guangzhi Xiong, Zhizheng Wang, Nicholas Wan, Joey Chan, Donald C. Comeau, Robert Leaman, Charalampos S. Floudas, Aidong Zhang, Michael F. Chiang, Yifan Peng, and Zhiyong Lu

TL;DR
Med-V1 is a family of small, three-billion-parameter language models trained on synthetic biomedical data. It performs comparably to frontier models such as GPT-5 on evidence attribution and hallucination detection.
Contribution
This paper introduces Med-V1, a lightweight biomedical language model that outperforms base models and rivals larger models in evidence verification tasks.
Findings
Med-V1 outperforms its base models by 27.0% to 71.3% on five biomedical benchmarks unified into a verification format.
Citation instructions significantly influence hallucination rates in LLMs.
Med-V1 can identify evidence misattributions in clinical guidelines.
Abstract
Assessing whether an article supports an assertion is essential for hallucination detection and claim verification. While large language models (LLMs) have the potential to automate this task, achieving strong performance requires frontier models such as GPT-5 that are prohibitively expensive to deploy at scale. To efficiently perform biomedical evidence attribution, we present Med-V1, a family of small language models with only three billion parameters. Trained on high-quality synthetic data newly developed in this study, Med-V1 substantially outperforms (+27.0% to +71.3%) its base models on five biomedical benchmarks unified into a verification format. Despite its smaller size, Med-V1 performs comparably to frontier LLMs such as GPT-5 and provides high-quality explanations for its predictions. We use Med-V1 to conduct a first-of-its-kind use case study that quantifies hallucinations in…
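To make the idea of a unified verification format concrete, here is a minimal sketch of how an (article, assertion) pair might be represented and scored. The label names, the `VerificationExample` class, and the `make_example` and `accuracy` helpers are all hypothetical illustrations, not the schema or metric used in the paper.

```python
from dataclasses import dataclass

# Hypothetical binary label set; the paper's actual labels may differ.
LABELS = ("supported", "not_supported")


@dataclass(frozen=True)
class VerificationExample:
    article: str    # text of the cited article (or its abstract)
    assertion: str  # claim whose support is being checked
    label: str      # one of LABELS


def make_example(article: str, assertion: str, supported: bool) -> VerificationExample:
    """Wrap one (article, assertion) pair from any source benchmark."""
    return VerificationExample(article, assertion, LABELS[0] if supported else LABELS[1])


def accuracy(gold: list[VerificationExample], predicted: list[str]) -> float:
    """Fraction of predicted labels that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty dataset")
    return sum(g.label == p for g, p in zip(gold, predicted)) / len(gold)
```

Under this assumed layout, benchmarks with different native formats could each be mapped through `make_example` so that a single scoring function applies to all of them.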
Taxonomy
Topics: Machine Learning in Healthcare · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
