Two-Step Question Retrieval for Open-Domain QA

Yeon Seonwoo; Juhee Son; Jiho Jin; Sang-Woo Lee; Ji-Hoon Kim; Jung-Woo Ha; Alice Oh

arXiv:2205.09393·cs.CL·May 20, 2022

TL;DR

This paper introduces SQuID, a two-step question retrieval model that enhances open-domain QA performance while maintaining high inference speed, by combining sequential bi-encoders and distant supervision.

Contribution

It proposes a novel two-step retrieval approach with dual bi-encoders and training via distant supervision, improving QA accuracy without sacrificing speed.

Findings

1. SQuID significantly improves question retrieval accuracy.

2. The model maintains high inference speed, with only a negligible slowdown.

3. Experimental results validate the effectiveness of the two-step approach.

Abstract

The retriever-reader pipeline has shown promising performance in open-domain QA but suffers from a very slow inference speed. Recently proposed question retrieval models tackle this problem by indexing question-answer pairs and searching for similar questions. These models have shown a significant increase in inference speed, but at the cost of lower QA performance compared to the retriever-reader models. This paper proposes a two-step question retrieval model, SQuID (Sequential Question-Indexed Dense retrieval) and distant supervision for training. SQuID uses two bi-encoders for question retrieval. The first-step retriever selects top-k similar questions, and the second-step retriever finds the most similar question from the top-k questions. We evaluate the performance and the computational efficiency of SQuID. The results show that SQuID significantly increases the performance of…
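The two-step lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `two_step_retrieve`, the hand-made two-dimensional vectors, the example answers, and the use of a dot product as the similarity score are all assumptions made for this example. In a real system the vectors would come from the two bi-encoders, and the first step would typically run against a prebuilt nearest-neighbor index.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity (an assumed scoring function)."""
    return sum(x * y for x, y in zip(a, b))


def two_step_retrieve(
    query_vec_1: Vector,
    query_vec_2: Vector,
    index_vecs_1: List[Vector],
    index_vecs_2: List[Vector],
    answers: List[str],
    k: int = 2,
) -> Tuple[int, str]:
    """Return (index, answer) of the indexed question chosen in two steps."""
    # Step 1: score every indexed question with the first encoder's
    # vectors and keep the k highest-scoring candidates.
    scores_1 = [dot(query_vec_1, v) for v in index_vecs_1]
    top_k = sorted(range(len(scores_1)), key=lambda i: scores_1[i], reverse=True)[:k]

    # Step 2: re-score only those k candidates with the second encoder's
    # vectors and pick the single most similar question.
    best = max(top_k, key=lambda i: dot(query_vec_2, index_vecs_2[i]))
    return best, answers[best]


# Hypothetical toy data: three indexed question-answer pairs, each embedded
# once per encoder. The values are invented purely for illustration.
index_vecs_1 = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
index_vecs_2 = [[0.2, 0.8], [0.9, 0.1], [1.0, 0.0]]
answers = ["Paris", "Berlin", "Rome"]
query_vec_1 = [1.0, 0.0]
query_vec_2 = [1.0, 0.0]

result_k2 = two_step_retrieve(query_vec_1, query_vec_2, index_vecs_1, index_vecs_2, answers, k=2)
result_k3 = two_step_retrieve(query_vec_1, query_vec_2, index_vecs_1, index_vecs_2, answers, k=3)
print(result_k2)  # (1, 'Berlin')
print(result_k3)  # (2, 'Rome')
```

In this toy data, the third pair scores highest under the second set of vectors but is filtered out by the first step when k=2, so the second step can only choose among the surviving candidates. Because the second step touches only k items rather than the whole index, its extra cost stays small.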

Code & Models

Repositories

yeonsw/squid (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems