Pruning the Index Contents for Memory Efficient Open-Domain QA

Martin Fajcik; Martin Docekal; Karel Ondrej; Pavel Smrz

arXiv:2102.10697·cs.CL·April 13, 2021

TL;DR

This paper introduces R2-D2 (Rank twice, reaD twice), a pipeline for open-domain QA, together with a simple index pruning method that keeps only 8% of the original index contents at a cost of 3% exact match (EM) accuracy.

Contribution

It proposes the novel R2-D2 pipeline, which combines a retriever, a passage reranker, an extractive reader, and a generative reader, along with a simple index pruning method for memory-efficient QA systems.

Findings

01

Index pruned to only 8% of its original contents

02

Achieved only 3% loss in exact match accuracy

03

The whole QA system, including index, OS, and library components, fits into a 6GiB Docker image
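
To make the 8% retention figure concrete, here is a minimal Python sketch of score-based index pruning. It is an illustration only, not the authors' implementation: the function `prune_index`, the `keep_fraction` parameter, and the word-count scorer `toy_relevance` are hypothetical names, and the summary above does not say how passages are actually scored.

```python
from typing import Callable, List, Sequence


def prune_index(
    passages: Sequence[str],
    relevance: Callable[[str], float],
    keep_fraction: float = 0.08,
) -> List[str]:
    """Keep the highest-scoring fraction of passages, in their original order."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not passages:
        return []
    n_keep = max(1, int(len(passages) * keep_fraction))
    # Rank passage positions by score, highest first (sorted() is stable for ties).
    ranked = sorted(
        range(len(passages)),
        key=lambda i: relevance(passages[i]),
        reverse=True,
    )
    # Restore the original ordering of the surviving passages.
    return [passages[i] for i in sorted(ranked[:n_keep])]


def toy_relevance(passage: str) -> float:
    # Assumption: a stand-in scorer that simply prefers longer passages.
    return float(len(passage.split()))


corpus = [f"passage {i} " + "word " * (i % 10) for i in range(100)]
pruned = prune_index(corpus, toy_relevance, keep_fraction=0.08)
print(len(corpus), "->", len(pruned))
```

In a real system the scorer would be whatever relevance estimate is available for a passage; only the top-fraction selection is shown here.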

Abstract

This work presents a novel pipeline that demonstrates what is achievable with a combined effort of state-of-the-art approaches. Specifically, it proposes the novel R2-D2 (Rank twice, reaD twice) pipeline composed of a retriever, a passage reranker, an extractive reader, a generative reader, and a simple way to combine them. Furthermore, previous work often comes with a massive index of external documents that scales on the order of tens of GiB. This work presents a simple approach for pruning the contents of a massive index such that the open-domain QA system, together with the index, OS, and library components, fits into a 6GiB Docker image while retaining only 8% of the original index contents and losing only 3% EM accuracy.
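
The "rank twice, read twice" flow described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's models: the token-overlap scorers, the first-word "readers", the cutoffs `k_retrieve` and `k_rerank`, and the score-summing combination are all hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import Dict, List


def tokens(text: str) -> set:
    # Lowercased word set with simple punctuation stripped.
    return {w.strip(".,?").lower() for w in text.split()}


def retriever_score(question: str, passage: str) -> float:
    # Assumption: cheap first-stage score = number of shared tokens.
    return float(len(tokens(question) & tokens(passage)))


def reranker_score(question: str, passage: str) -> float:
    # Assumption: second-stage score = shared tokens normalized by passage size.
    return len(tokens(question) & tokens(passage)) / len(tokens(passage))


def extractive_reader(question: str, passages: List[str]) -> Dict[str, float]:
    # Toy reader: proposes the first word of each passage, weighted by rank.
    return {p.split()[0]: 1.0 / (rank + 1) for rank, p in enumerate(passages)}


def generative_reader(question: str, passages: List[str]) -> Dict[str, float]:
    # Toy reader: "generates" the first word of the top passage.
    return {passages[0].split()[0]: 1.0}


def answer_question(question: str, index: List[str],
                    k_retrieve: int = 3, k_rerank: int = 2) -> str:
    # Rank once: score every passage in the index with the retriever.
    retrieved = sorted(index, key=lambda p: retriever_score(question, p),
                       reverse=True)[:k_retrieve]
    # Rank twice: rescore only the retrieved passages with the reranker.
    reranked = sorted(retrieved, key=lambda p: reranker_score(question, p),
                      reverse=True)[:k_rerank]
    # Read twice: both readers see the reranked passages; summing their
    # scores is an assumed combination rule, used here for illustration.
    combined: Dict[str, float] = {}
    for reader in (extractive_reader, generative_reader):
        for candidate, score in reader(question, reranked).items():
            combined[candidate] = combined.get(candidate, 0.0) + score
    return max(combined, key=combined.get)


index = [
    "Prague is the capital of the Czech Republic.",
    "Brno is the second largest city in the Czech Republic.",
    "Paris is the capital of France.",
    "The Vltava river flows through Prague.",
]
answer = answer_question("What is the capital of the Czech Republic?", index)
print(answer)
```

The point of the two ranking stages is that the expensive scorer only ever sees the short list produced by the cheap one, and the two readers vote on a shared candidate set.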

Taxonomy

Methods: Pruning