Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

Hao Cheng; Hao Fang; Xiaodong Liu; Jianfeng Gao

arXiv:2210.05156·cs.CL·May 24, 2023

TL;DR

This paper introduces TASER, a task-aware model for dense retrieval in open-domain question answering that shares parameters to improve efficiency and robustness, outperforming traditional bi-encoder models and BM25.

Contribution

TASER enables parameter sharing in dense retrieval models by interleaving shared and specialized blocks, improving efficiency and robustness over existing bi-encoder architectures.
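The interleaving idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the block types, the toy linear layers, the every-other-block layout, and the assumption that the caller supplies a task label ("question" or "passage") are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


def make_weights(dim, rng):
    """Build a toy dim x dim weight matrix (hypothetical stand-in for a real layer)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def linear(weights, vec):
    """Multiply a weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


@dataclass
class SharedBlock:
    weights: list

    def __call__(self, vec, task):
        # The same weights are used for questions and passages.
        out = linear(self.weights, vec)
        return [v + o for v, o in zip(vec, out)]  # residual connection


@dataclass
class SpecializedBlock:
    weights_by_task: dict

    def __call__(self, vec, task):
        # One weight set per task; the task label selects which one is applied.
        out = linear(self.weights_by_task[task], vec)
        return [v + o for v, o in zip(vec, out)]


def build_encoder(num_blocks, dim, seed=0):
    """Alternate shared and specialized blocks (assumed layout, for illustration only)."""
    rng = random.Random(seed)
    blocks = []
    for i in range(num_blocks):
        if i % 2 == 0:
            blocks.append(SharedBlock(make_weights(dim, rng)))
        else:
            blocks.append(SpecializedBlock({
                "question": make_weights(dim, rng),
                "passage": make_weights(dim, rng),
            }))
    return blocks


def encode(blocks, vec, task):
    """Run one input through the single encoder, routing by task label."""
    for block in blocks:
        vec = block(vec, task)
    return vec


encoder = build_encoder(num_blocks=4, dim=3)
question_vec = encode(encoder, [1.0, 0.5, -0.5], "question")
passage_vec = encode(encoder, [1.0, 0.5, -0.5], "passage")
```

In this sketch a single encoder serves both input types: the shared blocks behave identically for questions and passages, and only the specialized blocks differ by task.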

Findings

1. TASER surpasses BM25 in accuracy on five QA datasets.
2. TASER uses about 60% of the parameters of bi-encoder retrievers.
3. TASER demonstrates greater robustness in out-of-domain evaluations.

Abstract

Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular. Specifically, the de facto architecture for open-domain question answering uses two isomorphic encoders that are initialized from the same pretrained model but separately parameterized for questions and passages. This bi-encoder architecture is parameter-inefficient in that there is no parameter sharing between encoders. Further, recent studies show that such dense retrievers underperform BM25 in various settings. We thus propose a new architecture, Task-aware Specialization for dense Retrieval (TASER), which enables parameter sharing by interleaving shared and specialized blocks in a single encoder. Our experiments on five question answering datasets show that TASER can achieve superior accuracy, surpassing BM25, while using about 60% of the…
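The parameter-efficiency argument can be made concrete with a small back-of-the-envelope count. The sketch below is an assumption-laden toy, not the paper's accounting: it treats every block as having the same parameter count, duplicates whole blocks per task, and specializes every second block. The function names and the numbers (12 blocks, 1,000 parameters each) are hypothetical.

```python
def bi_encoder_params(num_blocks, params_per_block):
    """Two separately parameterized encoders: every block exists twice."""
    return 2 * num_blocks * params_per_block


def interleaved_params(num_blocks, params_per_block, specialized_every=2, num_tasks=2):
    """Single encoder: shared blocks exist once, specialized blocks once per task.

    Assumes (for illustration) that every `specialized_every`-th block is specialized.
    """
    total = 0
    for i in range(num_blocks):
        is_specialized = (i % specialized_every) == specialized_every - 1
        copies = num_tasks if is_specialized else 1
        total += copies * params_per_block
    return total


# Hypothetical sizes chosen only to make the arithmetic easy to follow.
ratio = interleaved_params(12, 1_000) / bi_encoder_params(12, 1_000)
print(ratio)  # 0.75 under these toy assumptions
```

Under these toy assumptions the single encoder needs 75% of the bi-encoder's parameters. The figure of about 60% reported above depends on architectural details that this summary does not spell out, so the sketch shows the direction of the saving rather than reproducing the reported number.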

Code & Models

Repositories

microsoft/taser
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications