Enhanced vectors for top-k document retrieval in Question Answering
Mohammed Hammad

TL;DR
This paper introduces a novel dense vector embedding method for document retrieval in question answering systems, enabling fast and accurate identification of relevant passages with real-time query processing.
Contribution
It proposes a unique embedding technique that incorporates passage identifiers into dense vectors, improving retrieval efficiency and accuracy in QA applications.
Findings
Real-time query vector creation in ~4 milliseconds
Enhanced retrieval accuracy for relevant documents
Efficient embedding of passage identifiers into vector space
Abstract
Modern applications, especially information retrieval web apps built around "search", are gradually moving towards "answering" modules. Conversational chatbots, which have proven more engaging to users, use question answering as their core. Because precise answering is computationally expensive, several approaches have been developed to prefetch the most relevant documents or passages from the database that contain the answer. We propose a different approach that retrieves the evidence documents efficiently and accurately, making sure that the relevant document for a given user query is not missed. We do so by assigning each document (or passage, in our case) a unique identifier and using these identifiers to create dense vectors that can be efficiently indexed. More precisely, we use the identifier to predict randomly sampled context window words of the relevant question…
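The abstract is truncated here, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a setup similar to a distributed bag-of-words objective: each passage identifier owns a dense vector that is trained to predict words randomly sampled from a question about that passage. The toy data, the full-softmax loss, the hyperparameters, and the choice to build a query vector by averaging word vectors are all illustrative assumptions, as are the function names.

```python
import numpy as np

# Hypothetical toy data: QUESTIONS[i] holds questions answered by passage id i.
QUESTIONS = [
    ["who wrote hamlet", "hamlet author name"],
    ["what is the capital of france", "france capital city"],
    ["how fast does light travel", "light travel speed"],
]


def train_identifier_vectors(questions_by_passage, dim=16, window=3,
                             epochs=200, lr=0.1, seed=0):
    """Learn one vector per passage id by predicting sampled question words."""
    rng = np.random.default_rng(seed)
    vocab = sorted({w for qs in questions_by_passage for q in qs for w in q.split()})
    word_to_idx = {w: i for i, w in enumerate(vocab)}
    id_vecs = rng.normal(scale=0.1, size=(len(questions_by_passage), dim))
    word_vecs = np.zeros((len(vocab), dim))  # output-side word vectors

    for _ in range(epochs):
        for pid, qs in enumerate(questions_by_passage):
            question = qs[rng.integers(len(qs))].split()
            # Randomly sample word positions from the question (assumed "context window").
            positions = rng.choice(len(question), size=min(window, len(question)),
                                   replace=False)
            for pos in positions:
                target = word_to_idx[question[pos]]
                logits = word_vecs @ id_vecs[pid]
                probs = np.exp(logits - logits.max())
                probs /= probs.sum()
                probs[target] -= 1.0  # cross-entropy gradient w.r.t. the logits
                grad_id = word_vecs.T @ probs
                word_vecs -= lr * np.outer(probs, id_vecs[pid])
                id_vecs[pid] -= lr * grad_id
    return id_vecs, word_vecs, word_to_idx


def query_vector(query, word_vecs, word_to_idx):
    """Assumed query encoding: mean of the vectors of known query words."""
    idx = [word_to_idx[w] for w in query.lower().split() if w in word_to_idx]
    if not idx:
        return np.zeros(word_vecs.shape[1])
    return word_vecs[idx].mean(axis=0)


def top_k(query, id_vecs, word_vecs, word_to_idx, k=2):
    """Return the k passage ids whose identifier vectors best match the query."""
    q = query_vector(query, word_vecs, word_to_idx)
    denom = np.linalg.norm(id_vecs, axis=1) * (np.linalg.norm(q) + 1e-12)
    scores = (id_vecs @ q) / denom  # cosine similarity
    return [int(i) for i in np.argsort(-scores)[:k]]


id_vecs, word_vecs, word_to_idx = train_identifier_vectors(QUESTIONS)
hamlet_hits = top_k("who wrote hamlet", id_vecs, word_vecs, word_to_idx)
light_hits = top_k("light travel speed", id_vecs, word_vecs, word_to_idx)
```

In this sketch, encoding a query is a lookup and an average, so no model inference is needed at query time. A real system would replace the brute-force cosine scan in `top_k` with an indexed nearest-neighbour search over the identifier vectors.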
Taxonomy
Topics: Topic Modeling · Recommender Systems and Techniques · Text and Document Classification Technologies
