Improving Biomedical Information Retrieval with Neural Retrievers

Man Luo; Arindam Mitra; Tejas Gokhale; Chitta Baral

arXiv:2201.07745·cs.IR·January 20, 2022

TL;DR

This paper enhances biomedical information retrieval by developing neural retrievers trained with synthetic data, novel pre-training tasks, and a multi-vector encoding model, outperforming traditional methods especially in small-corpus scenarios.

Contribution

It introduces a template-based question generation method, two new pre-training tasks, and the Poly-DPR model for improved neural retrieval in biomedical domains.
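
As a rough illustration of what multi-vector encoding can look like at scoring time, the sketch below represents each passage by several vectors and scores it against a single query vector by taking the maximum inner product. This is an assumption about the general technique, not the paper's implementation: the function names (`dot`, `multi_vector_score`, `rank`) and the toy vectors are hypothetical, and a real system would obtain the vectors from trained encoders.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def multi_vector_score(query: Vector, passage_vectors: List[Vector]) -> float:
    """Score a passage as the best match between the query and any of its vectors.

    Assumption: max-over-vectors aggregation; other aggregations are possible.
    """
    if not passage_vectors:
        raise ValueError("a passage needs at least one vector")
    return max(dot(query, v) for v in passage_vectors)


def rank(query: Vector, corpus: Dict[str, List[Vector]]) -> List[str]:
    """Return passage ids sorted from highest to lowest score."""
    return sorted(
        corpus,
        key=lambda pid: multi_vector_score(query, corpus[pid]),
        reverse=True,
    )


# Toy, hand-written vectors (hypothetical; not produced by any model).
toy_query = [1.0, 0.0]
toy_corpus = {
    "a": [[0.1, 0.9], [0.8, 0.1]],  # two vectors for passage "a"
    "b": [[0.5, 0.5]],              # one vector for passage "b"
}
ranking = rank(toy_query, toy_corpus)
```

The design point is that a passage covering several topics can match a query through whichever of its vectors is closest, instead of being compressed into one vector.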

Findings

1. Poly-DPR outperforms existing neural approaches.
2. Our method beats BM25 in small-corpus settings.
3. Hybrid models further improve retrieval in large-corpus scenarios.
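
The page does not say how the hybrid models combine a sparse retriever such as BM25 with a neural one. One common approach, shown here only as a hedged sketch, is to min-max normalize each retriever's scores per query and take a weighted sum. The function names and the weight `alpha` are hypothetical, and the scores are made-up numbers used purely for illustration.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(
    sparse: Dict[str, float],
    dense: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Weighted sum of normalized sparse and dense scores.

    alpha is an assumed mixing weight: 1.0 means sparse only, 0.0 means dense only.
    Documents missing from one retriever's results contribute 0.0 for that retriever.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    s, d = min_max(sparse), min_max(dense)
    return {
        doc: alpha * s.get(doc, 0.0) + (1.0 - alpha) * d.get(doc, 0.0)
        for doc in set(s) | set(d)
    }


# Made-up scores for three documents (illustrative only).
sparse_scores = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
dense_scores = {"d1": 0.2, "d2": 0.9, "d3": 0.1}
combined = hybrid_scores(sparse_scores, dense_scores, alpha=0.5)
best = max(combined, key=combined.get)
```

Normalizing first matters because sparse and dense scores live on different scales, so a raw sum would let one retriever dominate.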

Abstract

Information retrieval (IR) is essential in search engines and dialogue systems as well as natural language processing tasks such as open-domain question answering. IR serves an important function in the biomedical domain, where content and sources of scientific knowledge may evolve rapidly. Although neural retrievers have surpassed traditional IR approaches such as TF-IDF and BM25 in standard open-domain question answering tasks, they are still found lacking in the biomedical domain. In this paper, we seek to improve IR using neural retrievers (NR) in the biomedical domain, and achieve this goal using a three-pronged approach. First, to tackle the relative lack of data in the biomedical domain, we propose a template-based question generation method that can be leveraged to train neural retriever models. Second, we develop two novel pre-training tasks that are…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education