The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin

TL;DR
The paper introduces the Expando-Mono-Duo design pattern for text ranking that leverages pretrained sequence-to-sequence models within a multi-stage architecture, achieving near state-of-the-art results across multiple retrieval tasks.
Contribution
The paper presents a design pattern that combines document expansion with pointwise and pairwise reranking models, validates it across several domains, and provides open-source implementations.
Findings
Effective on multiple retrieval benchmarks
Achieves effectiveness at or near the state of the art
Works in a zero-shot setting in some cases
Abstract
We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations of texts prior to inverted indexing. "Mono" and "Duo" refer to components in a reranking pipeline based on a pointwise model and a pairwise model that rerank initial candidates retrieved using keyword search. We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design. In all these tasks, we achieve effectiveness that is at or near the state of the art, in some cases using…
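The three stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term-overlap retriever stands in for keyword search over an inverted index, `toy_pointwise` and `toy_pairwise` are hypothetical stand-ins for the pretrained sequence-to-sequence rerankers, and summing pairwise scores is just one simple aggregation choice assumed here. Scoring the rerankers on the original (unexpanded) text is also an assumption of this sketch.

```python
from itertools import permutations
from typing import Callable, Dict, List


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def expando(doc: str, predicted_queries: List[str]) -> str:
    """Append predicted queries to a document before it is indexed."""
    return " ".join([doc] + predicted_queries)


def keyword_retrieve(query: str, index: Dict[str, str], k: int) -> List[str]:
    """Toy first stage: count distinct query terms present in the indexed text.
    A stand-in for real keyword search (assumption of this sketch)."""
    q_terms = set(tokenize(query))
    scores = {d: len(q_terms & set(tokenize(t))) for d, t in index.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


def mono_rerank(query: str, candidates: List[str], docs: Dict[str, str],
                pointwise: Callable[[str, str], float], k: int) -> List[str]:
    """Pointwise stage: score each (query, document) pair independently."""
    return sorted(candidates, key=lambda d: (-pointwise(query, docs[d]), d))[:k]


def duo_rerank(query: str, candidates: List[str], docs: Dict[str, str],
               pairwise: Callable[[str, str, str], float]) -> List[str]:
    """Pairwise stage: compare every ordered pair and sum the preferences."""
    totals = {d: 0.0 for d in candidates}
    for a, b in permutations(candidates, 2):
        totals[a] += pairwise(query, docs[a], docs[b])
    return sorted(candidates, key=lambda d: (-totals[d], d))


def toy_pointwise(query: str, doc: str) -> float:
    """Hypothetical scorer: fraction of query terms found in the document."""
    q_terms = set(tokenize(query))
    return len(q_terms & set(tokenize(doc))) / len(q_terms) if q_terms else 0.0


def toy_pairwise(query: str, doc_a: str, doc_b: str) -> float:
    """Hypothetical scorer: preference for doc_a over doc_b, in [0, 1]."""
    sa, sb = toy_pointwise(query, doc_a), toy_pointwise(query, doc_b)
    return 1.0 if sa > sb else 0.5 if sa == sb else 0.0


def rank(query: str, docs: Dict[str, str], index: Dict[str, str],
         k0: int, k1: int) -> List[str]:
    """Keyword search over expanded text, then mono, then duo."""
    candidates = keyword_retrieve(query, index, k0)
    candidates = mono_rerank(query, candidates, docs, toy_pointwise, k1)
    return duo_rerank(query, candidates, docs, toy_pairwise)


# Illustrative data only; the documents and predicted queries are made up.
docs = {
    "d1": "covid vaccines reduce severe illness",
    "d2": "how to bake sourdough bread",
    "d3": "the immune response after immunization",
}
predicted = {
    "d1": ["do covid vaccines work"],
    "d2": ["sourdough recipe"],
    "d3": ["vaccine immune response"],
}
index = {d: expando(text, predicted[d]) for d, text in docs.items()}
print(rank("vaccine immune response", docs, index, k0=3, k1=2))
```

The pairwise stage compares every ordered pair of candidates, so its cost grows quadratically with the number of documents passed to it. In this sketch `k1` is kept smaller than `k0` for that reason.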
Code & Models
- castorini/msmarco_v1_doc_doc2query-t5_expansions (dataset, 47 downloads)
- castorini/msmarco_v1_doc_segmented_doc2query-t5_expansions (dataset, 43 downloads)
- castorini/msmarco_v1_passage_doc2query-t5_expansions (dataset, 44 downloads)
- castorini/msmarco_v2_doc_doc2query-t5_expansions (dataset, 65 downloads)
- castorini/msmarco_v2_doc_segmented_doc2query-t5_expansions (dataset, 46 downloads)
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Text and Document Classification Technologies
