DOGR: Leveraging Document-Oriented Contrastive Learning in Generative Retrieval

Penghao Lu; Xin Dong; Yuansheng Zhou; Lei Cheng; Chuan Yuan; Linjian Mo

arXiv:2502.07219 · cs.IR · February 13, 2025

TL;DR

This paper introduces DOGR, a novel generative retrieval framework that uses document-oriented contrastive learning to better model query-document relevance, achieving state-of-the-art results on benchmark datasets.

Contribution

It proposes a two-stage contrastive learning approach for generative retrieval, improving semantic understanding and relevance modeling over previous methods.

Findings

1. DOGR outperforms existing generative retrieval models on benchmark datasets.
2. Contrastive learning enhances semantic representations in retrieval tasks.
3. The framework is effective across various identifier construction techniques.

Abstract

Generative retrieval is an innovative approach to information retrieval that uses generative language models (LMs) to generate a ranked list of document identifiers (docids) for a given query. It simplifies the retrieval pipeline by replacing the large external index with model parameters. However, existing work learns only the relationship between queries and document identifiers, which cannot directly represent the relevance between queries and documents. To address this problem, we propose a novel and general generative retrieval framework, Leveraging Document-Oriented Contrastive Learning in Generative Retrieval (DOGR), which uses contrastive learning to improve generative retrieval. It adopts a two-stage learning strategy that comprehensively captures the relationship between queries and documents through direct interactions. Furthermore,…
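The abstract as shown here does not specify the contrastive objective. To illustrate the general idea of contrasting queries against documents directly, the following is a minimal sketch of a standard in-batch InfoNCE loss over query and document embeddings. It is an assumption for illustration, not DOGR's actual loss: the function name `in_batch_contrastive_loss`, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def in_batch_contrastive_loss(query_vecs, doc_vecs, temperature=0.1):
    """Mean InfoNCE loss (illustrative, not the paper's objective).

    query_vecs[i] is assumed to be paired with doc_vecs[i]; every other
    document in the batch serves as a negative for query i.
    """
    queries = [normalize(q) for q in query_vecs]
    docs = [normalize(d) for d in doc_vecs]
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in docs]
        # Numerically stable log-sum-exp over all documents in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the paired (positive) document.
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy embeddings (hypothetical): each query points toward its own document.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[0.9, 0.1], [0.2, 0.8]]
swapped_docs = [[0.2, 0.8], [0.9, 0.1]]

loss_aligned = in_batch_contrastive_loss(queries, aligned_docs)
loss_swapped = in_batch_contrastive_loss(queries, swapped_docs)
print(loss_aligned, loss_swapped)
```

In this sketch the loss is small when each query is most similar to its paired document and large when the pairing is swapped, which is the signal a contrastive stage would use to pull relevant query-document pairs together.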

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Information Retrieval and Search Behavior

Methods: Contrastive Learning