Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering
Sohee Yang, Minjoon Seo

TL;DR
This paper proposes strategies to drastically reduce the storage footprint of retrieve-and-read open-domain QA systems, making them viable for edge devices while maintaining or improving accuracy over parametric models.
Contribution
It introduces orthogonal methods to reduce storage by up to 160x, enabling efficient retrieve-and-read QA systems suitable for resource-constrained environments.
Findings
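The storage footprint of a retrieve-and-read system can be reduced by up to 160x.
Retrieve-and-read achieves better accuracy than a purely parametric model of comparable docker-level system size.
Retrieve-and-read is a viable option for highly constrained serving environments such as edge devices.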
Abstract
In open-domain question answering (QA), the retrieve-and-read mechanism has the inherent benefits of interpretability and the ease of adding, removing, or editing knowledge compared to the parametric approaches of closed-book QA models. However, it is also known to suffer from a large storage footprint due to its document corpus and index. Here, we discuss several orthogonal strategies to drastically reduce the footprint of a retrieve-and-read open-domain QA system by up to 160x. Our results indicate that retrieve-and-read can be a viable option even in a highly constrained serving environment such as edge devices, as we show that it can achieve better accuracy than a purely parametric model with comparable docker-level system size.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems
