Improve Dense Passage Retrieval with Entailment Tuning

Lu Dai; Hao Liu; Hui Xiong

arXiv:2410.15801 · cs.CL · October 22, 2024


TL;DR

This paper introduces entailment tuning, a novel method that enhances dense passage retrieval by aligning relevance scoring with entailment concepts from NLI, leading to improved retrieval embeddings.

Contribution

It proposes a unified training approach using existence claims to connect retrieval and NLI data, improving dense retriever performance.

Findings

01 Enhanced retrieval accuracy demonstrated in experiments

02 Efficient integration with existing dense retrieval models

03 Better relevance modeling through entailment alignment

Abstract

A retrieval module can be plugged into many downstream NLP tasks, such as open-domain question answering and retrieval-augmented generation, to improve their performance. The key to a retrieval system is calculating relevance scores for query-passage pairs. However, the definition of relevance is often ambiguous. We observe that a major class of relevance aligns with the concept of entailment in natural language inference (NLI) tasks. Based on this observation, we design a method called entailment tuning to improve the embeddings of dense retrievers. Specifically, we unify the form of retrieval data and NLI data using existence claims as a bridge. Then, we train retrievers to predict the claims entailed by a passage with a variant of the masked prediction task. Our method can be efficiently plugged into current dense retrieval methods, and experiments show its effectiveness.
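The data preparation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the claim template in `to_existence_claim`, the whitespace tokenization, the `[SEP]` and `[MASK]` markers, and the `mask_ratio` default are all assumptions made for illustration. The sketch only shows the general shape of the idea, namely turning a question into a declarative claim, pairing it with a passage, and masking claim tokens so that a model would have to recover them from the passage.

```python
import random

MASK = "[MASK]"
SEP = "[SEP]"


def to_existence_claim(question: str) -> str:
    """Wrap a question as a declarative claim.

    The template is hypothetical; the abstract does not specify
    the wording used to build existence claims.
    """
    body = question.strip().rstrip("?").strip()
    return f"There exists an answer to the question: {body}."


def build_masked_example(passage: str, claim: str,
                         mask_ratio: float = 0.8, seed: int = 0):
    """Concatenate passage and claim, masking only claim tokens.

    Returns (inputs, labels), where labels maps each masked input
    position to the original token. mask_ratio is an assumed value.
    """
    rng = random.Random(seed)
    passage_tokens = passage.split()  # whitespace tokens, for simplicity
    claim_tokens = claim.split()
    if not claim_tokens:
        raise ValueError("claim must contain at least one token")

    # Mask at least one claim token, and never more than the claim has.
    n_mask = min(len(claim_tokens),
                 max(1, round(mask_ratio * len(claim_tokens))))
    positions = rng.sample(range(len(claim_tokens)), n_mask)

    offset = len(passage_tokens) + 1  # claim starts after passage and SEP
    masked_claim = list(claim_tokens)
    labels = {}
    for pos in positions:
        labels[offset + pos] = claim_tokens[pos]
        masked_claim[pos] = MASK

    inputs = passage_tokens + [SEP] + masked_claim
    return inputs, labels


if __name__ == "__main__":
    claim = to_existence_claim("Who wrote Hamlet?")
    inputs, labels = build_masked_example(
        "Hamlet was written by William Shakespeare.", claim)
    print(" ".join(inputs))
    print(labels)
```

In a real setup the masked sequence would be fed to a pretrained encoder with a masked-language-modeling head, and NLI premise-hypothesis pairs could be passed through `build_masked_example` in the same way as passage-claim pairs; both steps are outside the scope of this sketch.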


Taxonomy

Topics: Algorithms and Data Compression · Web Data Mining and Analysis · Natural Language Processing Techniques