Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters

Eugene Yang; Suraj Nair; Dawn Lawrie; James Mayfield; Douglas W. Oard

arXiv:2212.10448 · cs.IR · December 21, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a parameter-efficient method using adapters for zero-shot cross-language dense retrieval, demonstrating improved transfer from monolingual models and analyzing the limitations of language adapter replacement at inference.

Contribution

It proposes a novel adapter-based transfer approach for cross-language retrieval, showing it outperforms full fine-tuning and analyzing the inefficacy of language adapter replacement during inference.

Findings

01

Adapters improve cross-language transfer in dense retrieval.

02

Replacing language adapters at inference is suboptimal for retrieval.

03

Adapter-based models trained on monolingual data outperform full fine-tuning in cross-language information retrieval (CLIR).

Abstract

A popular approach to creating a zero-shot cross-language retrieval model is to substitute the monolingual pretrained language model in the retrieval model with a multilingual pretrained language model such as Multilingual BERT. This multilingual model is fine-tuned for the retrieval task on monolingual data such as English MS MARCO, using the same training recipe as the monolingual retrieval model. However, such transferred models suffer from a mismatch between the languages of the input text seen during training and those seen at inference. In this work, we propose transferring monolingual retrieval models using adapters, a parameter-efficient component for a transformer network. By combining adapters pretrained on language tasks for a specific language with task-specific adapters, prior work has shown that the adapter-enhanced models perform better than fine-tuning the entire model when transferring across…
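
As a rough illustration of the adapter idea mentioned in the abstract, the Python sketch below shows a bottleneck adapter, one common adapter design: a small down-projection, a nonlinearity, an up-projection, and a residual connection, added alongside transformer weights that stay frozen. It is not the authors' implementation, and the paper's exact architecture may differ. The class name `BottleneckAdapter`, the helper `adapter_param_count`, the layer sizes, and the zero initialization of the up-projection are all assumptions chosen for illustration.

```python
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def adapter_param_count(hidden, bottleneck):
    """Parameters in one adapter: two projections plus their biases."""
    return 2 * hidden * bottleneck + hidden + bottleneck


class BottleneckAdapter:
    """Hypothetical adapter computing h + W_up @ relu(W_down @ h + b_down) + b_up."""

    def __init__(self, hidden, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden -> bottleneck, small random init (assumed).
        self.W_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden)]
                       for _ in range(bottleneck)]
        self.b_down = [0.0] * bottleneck
        # Up-projection: bottleneck -> hidden, zero init (assumed) so the
        # adapter starts out as the identity function.
        self.W_up = [[0.0] * bottleneck for _ in range(hidden)]
        self.b_up = [0.0] * hidden

    def __call__(self, h):
        z = relu([a + b for a, b in zip(matvec(self.W_down, h), self.b_down)])
        delta = [a + b for a, b in zip(matvec(self.W_up, z), self.b_up)]
        # Residual connection: the frozen layer's output passes through,
        # and the adapter only adds a learned correction.
        return [x + d for x, d in zip(h, delta)]

    def num_params(self):
        return adapter_param_count(len(self.W_up), len(self.W_down))


# A toy hidden state standing in for one token's transformer output.
h = [0.5, -1.0, 0.25, 0.0, 1.5, -0.5, 2.0, 1.0]
adapter = BottleneckAdapter(hidden=8, bottleneck=2)
out = adapter(h)

print(out == h)                       # identity before any training
print(adapter.num_params())           # trainable parameters in this adapter
print(adapter_param_count(768, 64))   # hypothetical BERT-sized setting
```

With a hidden size of 768 and a bottleneck of 64 (both hypothetical values), such an adapter has 99,136 parameters, compared with 590,592 for a single dense 768×768 layer with bias. Training only adapters of this size while leaving the pretrained weights untouched is what makes the approach parameter-efficient.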

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Attention Dropout · Residual Connection · Weight Decay · WordPiece · Linear Warmup With Linear Decay