Overview of TREC 2024 Biomedical Generative Retrieval (BioGen) Track

Deepak Gupta; Dina Demner-Fushman; William Hersh; Steven Bedrick; Kirk Roberts

arXiv:2411.18069 · cs.IR · December 17, 2024

Open Access

TL;DR

The paper discusses the TREC 2024 Biomedical Generative Retrieval (BioGen) Track, focusing on evaluating and improving the grounding of large language models in reliable biomedical sources to reduce hallucinations and false information.

Contribution

It introduces a pilot task on reference attribution to help mitigate false statements generated by LLMs in biomedical question answering.

Findings

01. Highlighting the challenge of hallucinations in biomedical LLMs

02. Proposing reference attribution as a solution to improve factual grounding

03. Setting up evaluation approaches for biomedical LLM reliability

Abstract

With the advancement of large language models (LLMs), the biomedical domain has seen significant progress in multiple tasks, such as biomedical question answering, lay-language summarization of the biomedical literature, and clinical note summarization. However, hallucinations or confabulations remain one of the key challenges when using LLMs in the biomedical and other domains. Inaccuracies may be particularly harmful in high-risk situations, such as medical question answering, making clinical decisions, or appraising biomedical research. Studies evaluating LLMs' abilities to ground generated statements in verifiable sources have shown that models perform significantly worse on lay-user-generated questions and often fail to reference relevant sources. This can be problematic when those seeking information want evidence from studies to back up the claims…

Taxonomy

Topics: Biomedical Text Mining and Ontologies