MedCite: Can Language Models Generate Verifiable Text for Medicine?

Xiao Wang; Mengjue Tan; Qiao Jin; Guangzhi Xiong; Yu Hu; Aidong Zhang; Zhiyong Lu; Minjia Zhang

arXiv:2506.06605·cs.CL·June 10, 2025

TL;DR

MedCite is an end-to-end framework that enables the generation and evaluation of verifiable citations by language models for medical question-answering, addressing a key gap in current AI medical systems.

Contribution

Introduces MedCite, the first comprehensive framework for citation generation and evaluation in medical LLM applications, including a novel multi-pass retrieval-citation method.

Findings

01 Improved citation precision and recall over baseline methods

02 Evaluation correlates well with expert annotations

03 Highlights key design choices impacting citation quality

Abstract

Existing LLM-based medical question-answering systems lack citation generation and evaluation capabilities, raising concerns about their adoption in practice. In this work, we introduce MedCite, the first end-to-end framework that facilitates the design and evaluation of citation generation with LLMs for medical tasks. We also introduce a novel multi-pass retrieval-citation method that generates high-quality citations. Our evaluation highlights the challenges and opportunities of citation generation for medical tasks and identifies design choices that have a significant impact on the final citation quality. Our proposed method improves citation precision and recall over strong baseline methods, and we show that the evaluation results correlate well with annotations from professional experts.
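The abstract reports citation precision and recall without defining them here. As a rough illustration only, the sketch below uses one common simplified reading: precision is the share of emitted citations that were judged to support their statement, and recall is the share of statements with at least one supporting citation. This is an assumption, not MedCite's actual metric definition; the function name, the input layout, and the document IDs are all hypothetical.

```python
from typing import List, Set, Tuple


def citation_precision_recall(
    cited: List[Set[str]],
    supported_by: List[Set[str]],
) -> Tuple[float, float]:
    """Simplified citation metrics (illustrative, not the paper's definition).

    cited[i]        : document IDs the model cited for statement i
    supported_by[i] : document IDs judged (by a human or a model) to
                      support statement i
    """
    if len(cited) != len(supported_by):
        raise ValueError("cited and supported_by must have the same length")

    # Precision: fraction of all emitted citations that support their statement.
    total_citations = sum(len(c) for c in cited)
    correct_citations = sum(len(c & s) for c, s in zip(cited, supported_by))
    precision = correct_citations / total_citations if total_citations else 0.0

    # Recall: fraction of statements backed by at least one supporting citation.
    covered = sum(1 for c, s in zip(cited, supported_by) if c & s)
    recall = covered / len(cited) if cited else 0.0

    return precision, recall


# Hypothetical example: three statements, three citations in total.
cited = [{"doc_a", "doc_b"}, {"doc_c"}, set()]
supported_by = [{"doc_a"}, {"doc_d"}, {"doc_e"}]
precision, recall = citation_precision_recall(cited, supported_by)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy input only one of the three citations is supporting and only one of the three statements is covered, so both scores come out to one third. A real evaluation would also need a procedure for producing the support judgments, which this sketch takes as given.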

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education