Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

Qian Dong; Yiding Liu; Qingyao Ai; Zhijing Wu; Haitao Li; Yiqun Liu; Shuaiqiang Wang; Dawei Yin; Shaoping Ma

arXiv:2309.17078·cs.IR·March 27, 2024

Open Access

TL;DR

This paper introduces an unsupervised reinforcement learning method called RLCF that improves large language models' ability to generate distinctive, context-specific responses for information retrieval tasks by leveraging contrastive feedback signals.

Contribution

The paper presents RLCF, a novel unsupervised alignment technique using contrastive feedback and reinforcement learning to enhance LLMs' performance in IR tasks.

Findings

01. RLCF significantly outperforms existing alignment methods.

02. RLCF-optimized LLMs produce more distinctive responses.

03. The approach is effective across different languages and model sizes.

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities across various research domains, including Information Retrieval (IR). However, the responses generated by off-the-shelf LLMs tend to be generic, i.e., they cannot capture what distinguishes each document from others with similar content. This limits the performance of LLMs in IR because finding and distinguishing relevant documents among a large number of similar documents is a typical problem in many IR tasks. To address this issue, we propose an unsupervised alignment method, namely Reinforcement Learning from Contrastive Feedback (RLCF), empowering LLMs to generate both high-quality and context-specific responses. Our approach constructs unsupervised contrastive feedback signals based on similar document groups, and adopts a reward function, named group-wise reciprocal rank, to optimize LLMs within a standard Proximal…
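The abstract names a group-wise reciprocal rank reward but, in the truncated text shown here, does not define it. The sketch below is one plausible reading, not the authors' implementation: a response generated from one document is scored against every document in its group of similar documents, and the reward is the reciprocal of the source document's rank. The function names, the toy `token_overlap` scorer (a stand-in for a real retriever), and the pessimistic tie handling are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def reciprocal_rank_in_group(
    response: str,
    group: Sequence[str],
    source_index: int,
    score: Callable[[str, str], float],
) -> float:
    """Return 1 / rank of the source document within its similar-document group.

    Hypothetical helper: the paper's exact reward definition is not given
    in the abstract, so this is an assumed formulation.
    """
    scores = [score(response, doc) for doc in group]
    target = scores[source_index]
    # Assumption: ties count against the source document, so a response that
    # matches several similar documents equally well earns a lower reward.
    rank = 1 + sum(
        1 for i, s in enumerate(scores) if i != source_index and s >= target
    )
    return 1.0 / rank


def token_overlap(response: str, document: str) -> float:
    """Toy scorer (assumption): Jaccard overlap of lowercase whitespace tokens."""
    a, b = set(response.lower().split()), set(document.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# A small group of similar documents; index 0 is the source document.
group = [
    "the red apple is sweet",
    "the green apple is sour",
    "the ripe banana is sweet",
]

# A response that captures what is distinctive about the source document.
specific_reward = reciprocal_rank_in_group("red sweet apple", group, 0, token_overlap)

# A generic response that fits two documents in the group equally well.
generic_reward = reciprocal_rank_in_group("the apple is", group, 0, token_overlap)

print(specific_reward, generic_reward)
```

Under these assumptions the distinctive response earns a higher reward than the generic one, which is the kind of contrastive signal the abstract describes. In an actual training setup a scalar like this would be passed to the reinforcement learning optimizer as the per-response reward.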

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods