DOGR: Leveraging Document-Oriented Contrastive Learning in Generative Retrieval
Penghao Lu, Xin Dong, Yuansheng Zhou, Lei Cheng, Chuan Yuan, Linjian Mo

TL;DR
This paper introduces DOGR, a novel generative retrieval framework that uses document-oriented contrastive learning to better model query-document relevance, achieving state-of-the-art results on benchmark datasets.
Contribution
It proposes a two-stage contrastive learning approach for generative retrieval, improving semantic understanding and relevance modeling over previous methods.
Findings
DOGR outperforms existing generative retrieval models on benchmark datasets.
Contrastive learning enhances semantic representations in retrieval tasks.
The framework is effective across various identifier construction techniques.
Abstract
Generative retrieval constitutes an innovative approach in information retrieval, leveraging generative language models (LM) to generate a ranked list of document identifiers (docid) for a given query. It simplifies the retrieval pipeline by replacing the large external index with model parameters. However, existing works merely learn the relationship between queries and document identifiers, which cannot directly represent the relevance between queries and documents. To address this problem, we propose a novel and general generative retrieval framework, namely Leveraging Document-Oriented Contrastive Learning in Generative Retrieval (DOGR), which leverages contrastive learning to improve generative retrieval tasks. It adopts a two-stage learning strategy that captures the relationship between queries and documents comprehensively through direct interactions. Furthermore,…
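To make the idea of contrastive learning between queries and documents concrete, here is a minimal sketch of a generic in-batch contrastive (InfoNCE-style) loss. It is not the paper's actual objective, which this summary does not spell out. The function names `cosine` and `info_nce_loss`, the `temperature` value, and the toy embeddings are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(query_embs, doc_embs, temperature=0.1):
    """Generic in-batch contrastive loss (hypothetical helper, not from the paper).

    query_embs[i] is assumed to be paired with doc_embs[i]; every other
    document in the batch serves as a negative for query i.
    """
    losses = []
    for i, q in enumerate(query_embs):
        logits = [cosine(q, d) / temperature for d in doc_embs]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the matching document.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (made up for illustration): query i is close to document i.
queries = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
docs = [[0.9, 0.0, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

aligned = info_nce_loss(queries, docs)
# Rotating the documents breaks the pairing, so the loss should rise.
shuffled = info_nce_loss(queries, docs[1:] + docs[:1])
print(aligned < shuffled)
```

Minimizing a loss of this shape pulls each query embedding toward its relevant document and away from the other documents in the batch, which is the sense in which contrastive learning models query-document relevance directly rather than only through identifiers.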
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Information Retrieval and Search Behavior
Methods: Contrastive Learning
