Task-aware Retrieval with Instructions
Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih

TL;DR
This paper introduces TART, a multi-task retrieval system trained on a large instruction-based dataset, which effectively adapts to new tasks and outperforms larger models on zero-shot retrieval benchmarks.
Contribution
The paper presents BERRI, the first large-scale instruction-based retrieval dataset, and TART, a system trained on it that advances zero-shot retrieval performance and real-world task adaptability.
Findings
TART outperforms models up to three times larger on the BEIR and LOTTE zero-shot retrieval benchmarks.
TART demonstrates strong zero-shot generalization to new retrieval tasks.
In the new X^2-Retrieval evaluation setup, which pools diverse domains and tasks to reflect real-world scenarios, TART remains effective.
Abstract
We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can follow human-written instructions to find the best documents for a given query. We introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, BERRI, and present TART, a multi-task retrieval system trained on BERRI with instructions. TART shows strong capabilities to adapt to a new retrieval task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup, X^2-Retrieval, to better reflect real-world scenarios, where diverse domains and tasks are pooled and a…
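To make the idea of "retrieval with instructions" concrete, here is a minimal sketch in which the same query returns a different top document depending on the instruction that accompanies it. This is not the TART model. The function names, the way the instruction and query are joined, the example instructions, and the toy documents are all assumptions made for illustration, and a simple bag-of-words cosine similarity stands in for the learned encoder that a real system would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(instruction, query, documents):
    # Assumption: the instruction is simply prepended to the query text.
    # A trained task-aware retriever would encode this with a neural model;
    # lexical overlap is only a stand-in so the example runs anywhere.
    query_vec = Counter(tokenize(f"{instruction} {query}"))
    scored = [(cosine(query_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Hypothetical documents and instructions, invented for this example.
CODE_DOC = "python code: def binary_search(items, target)"
WIKI_DOC = "wikipedia paragraph: binary search is an algorithm for sorted lists"
DOCS = [CODE_DOC, WIKI_DOC]

QUERY = "binary search"
CODE_INSTRUCTION = "Retrieve python code that answers this question"
WIKI_INSTRUCTION = "Retrieve a wikipedia paragraph that answers this question"

top_for_code = rank(CODE_INSTRUCTION, QUERY, DOCS)[0]
top_for_wiki = rank(WIKI_INSTRUCTION, QUERY, DOCS)[0]
```

In this toy setup the code snippet ranks first under the code instruction and the encyclopedia-style passage ranks first under the other, even though the query string is identical. That only happens here because the instruction words overlap lexically with the documents; the point of a trained task-aware retriever is to follow the stated intent without relying on such overlap.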
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
