Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents

Shreya Saxena; Raj Sangani; Siva Prasad; Shubham Kumar; Mihir Athale; Rohan Awhad; Vishal Vaddina

arXiv:2302.06854·cs.IR·February 15, 2023

TL;DR

This paper presents a scalable, integrated system for extracting and retrieving complex biomedical information from large research datasets, enhancing the efficiency and accuracy of information retrieval in healthcare research.

Contribution

It introduces a comprehensive knowledge synthesis and retrieval framework combining lexical and semantic methods for complex biomedical queries, demonstrated on COVID-19 research data.

Findings

01. Effective retrieval of relevant research paragraphs and triplets

02. Enhanced question answering for complex biomedical queries

03. Demonstrated scalability on large datasets like CORD-19

Abstract

Recent advances in the healthcare industry have led to an abundance of unstructured data, making it challenging to perform tasks such as efficient and accurate information retrieval at scale. Our work offers an all-in-one scalable solution for extracting and exploring complex information from large-scale research documents, which would otherwise be tedious. First, we briefly explain our knowledge synthesis process to extract helpful information from unstructured text data of research documents. Then, on top of the knowledge extracted from the documents, we perform complex information retrieval using three major components: Paragraph Retrieval, Triplet Retrieval from Knowledge Graphs, and Complex Question Answering (QA). These components combine lexical and semantic-based methods to retrieve paragraphs and triplets and perform faceted refinement for filtering these search results. The…
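
The idea of combining lexical and semantic scores and then narrowing the results by facets can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weighting parameter `alpha`, the `meta` facet dictionary, and the toy paragraphs are all hypothetical, and character-trigram cosine similarity stands in for a real embedding model purely so the example runs with the standard library alone.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, paragraph):
    """Fraction of distinct query tokens that occur in the paragraph."""
    q = set(tokens(query))
    if not q:
        return 0.0
    return len(q & set(tokens(paragraph))) / len(q)


def trigram_vector(text):
    """Character-trigram counts; a crude stand-in for a learned embedding."""
    s = " ".join(tokens(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def semantic_score(query, paragraph):
    return cosine(trigram_vector(query), trigram_vector(paragraph))


def hybrid_search(query, paragraphs, facets=None, alpha=0.5, top_k=3):
    """Rank paragraphs by a weighted sum of lexical and semantic scores.

    `facets` is a dict of metadata constraints (assumed exact match);
    paragraphs that violate any constraint are filtered out before scoring.
    """
    facets = facets or {}
    results = []
    for p in paragraphs:
        if any(p["meta"].get(k) != v for k, v in facets.items()):
            continue
        score = (alpha * lexical_score(query, p["text"])
                 + (1 - alpha) * semantic_score(query, p["text"]))
        results.append((score, p["id"]))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:top_k]


# Invented toy paragraphs, for illustration only.
PARAGRAPHS = [
    {"id": "p1", "text": "Coronavirus transmission in hospital settings",
     "meta": {"year": 2020}},
    {"id": "p2", "text": "Vaccine trial design for respiratory viruses",
     "meta": {"year": 2021}},
    {"id": "p3", "text": "Soil chemistry of alpine meadows",
     "meta": {"year": 2020}},
]

if __name__ == "__main__":
    print(hybrid_search("coronavirus transmission", PARAGRAPHS))
    print(hybrid_search("viruses", PARAGRAPHS, facets={"year": 2021}))
```

In a real system the trigram stand-in would be replaced by dense embeddings and the overlap score by a ranking function such as BM25; the weighted sum and the pre-scoring facet filter are only one of several plausible ways to combine the pieces.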
