Exploring Dual Encoder Architectures for Question Answering
Zhe Dong, Jianmo Ni, Daniel M. Bikel, Enrique Alfonseca, Yuan Wang, Chen Qu, Imed Zitouni

TL;DR
This paper investigates various dual encoder architectures for question answering, demonstrating that sharing parameters in projection layers enhances performance and provides insights into the embedding space structure.
Contribution
The paper proposes three improved ADE variants that share or freeze components between the two encoder towers, and shows that sharing projection-layer parameters improves QA retrieval effectiveness.
Findings
SDE outperforms ADE in QA tasks
Sharing projection layer parameters improves ADE performance
Parameter sharing influences embedding space structure as shown by t-SNE
Abstract
Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results. Previous research focuses on two major types of dual encoders, Siamese Dual Encoder (SDE), with parameters shared across two encoders, and Asymmetric Dual Encoder (ADE), with two distinctly parameterized encoders. In this work, we explore different ways in which the dual encoder can be structured, and evaluate how these differences can affect their efficacy in terms of QA retrieval tasks. By evaluating on MS MARCO, open domain NQ and the MultiReQA benchmarks, we show that SDE performs significantly better than ADE. We further propose three different improved versions of ADEs by sharing or freezing parts of the architectures between two encoder towers. We find that sharing parameters in projection layers would enable ADEs to perform competitively with or outperform SDEs. We…
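The difference between an SDE, an ADE, and an ADE with a shared projection layer can be illustrated with a minimal sketch. It is a toy illustration, not the authors' implementation: small random linear layers stand in for the real encoders, dot-product scoring is an assumption, and every name (`Tower`, `build_dual_encoder`, `score`, the `kind` strings) is hypothetical.

```python
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix as a list of rows (toy stand-in for trained weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Tower:
    """Toy encoder tower: a body layer with ReLU, then a projection layer."""

    def __init__(self, body, projection):
        self.body = body              # stand-in for the main encoder weights
        self.projection = projection  # final projection into the embedding space

    def encode(self, features):
        hidden = [max(0.0, h) for h in matvec(self.body, features)]
        return matvec(self.projection, hidden)


def build_dual_encoder(kind, in_dim=8, hidden_dim=6, out_dim=4, seed=0):
    """Return (question_tower, answer_tower) for the requested sharing scheme."""
    rng = random.Random(seed)
    q_body = make_matrix(hidden_dim, in_dim, rng)
    q_proj = make_matrix(out_dim, hidden_dim, rng)
    if kind == "sde":
        # Siamese: both towers reuse the very same parameter objects.
        return Tower(q_body, q_proj), Tower(q_body, q_proj)
    a_body = make_matrix(hidden_dim, in_dim, rng)
    if kind == "ade_shared_projection":
        # Asymmetric bodies, but one projection layer shared by both towers.
        return Tower(q_body, q_proj), Tower(a_body, q_proj)
    if kind == "ade":
        # Fully asymmetric: no parameters shared between the towers.
        a_proj = make_matrix(out_dim, hidden_dim, rng)
        return Tower(q_body, q_proj), Tower(a_body, a_proj)
    raise ValueError(f"unknown dual encoder kind: {kind}")


def score(question_tower, answer_tower, question_features, answer_features):
    """Relevance score as a dot product of the two embeddings (assumed here)."""
    q = question_tower.encode(question_features)
    a = answer_tower.encode(answer_features)
    return sum(x * y for x, y in zip(q, a))
```

In this sketch, sharing is expressed by passing the same weight object to both towers, so any update to a shared layer would affect question and answer embeddings alike. That is one simple way to picture why a shared projection layer could place both embeddings in a common space.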
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
