Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval

Luo Ji, Feixiang Guo, Teng Chen, Qingqing Gu, Xiaoyu Wang, Ningyuan Xi, Yihong Wang, Peng Yu, Yue Zhao, Hongyang Lei, Zhonglin Jiang, Yong Chen

arXiv:2412.16615 · cs.IR · April 10, 2025

TL;DR

This paper introduces RaHoRe, a retrieval framework using instruction-tuned LLMs for hidden rationale retrieval, demonstrating superior zero-shot and fine-tuned performance on emotional support conversations.

Contribution

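The paper proposes a new retrieval task, hidden rationale retrieval, and a framework that addresses it with LLMs, instruction prompts, and direct preference optimization (DPO), extending retrieval beyond cases where query and document are semantically similar.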
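For readers unfamiliar with DPO, the standard pairwise loss can be sketched as below. This is a minimal illustration of the general DPO objective, not the authors' training code. The function name `dpo_loss`, the default `beta`, and the assumption that each training pair consists of a preferred and a dispreferred answer to the same prompt are all illustrative choices, not details taken from this page.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    answers under the trainable policy and a frozen reference model.
    `beta` (an assumed default) scales how strongly the policy is
    pushed away from the reference.
    """
    # How much more the policy favors each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), computed stably.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy matches the reference, the loss equals log 2; it falls as the policy assigns relatively more probability to the chosen answer than the reference does.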

Findings

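01 RaHoRe outperforms previous retrieval methods on the Emotional Support Conversation (ESC) dataset.

02 Both the zero-shot and the fine-tuned variants improve over previous methods.

03 The framework is made more computationally efficient with no loss in performance.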

Abstract

Despite recent advances in Retrieval-Augmented Generation (RAG) systems, most retrieval methods are developed for factual retrieval, which assumes that the query and its positive documents are semantically similar. In this paper, we instead propose and study a more challenging type of retrieval task, called hidden rationale retrieval, in which the query and document are not similar but their relevance can be inferred through reasoning chains, logical relationships, or empirical experience. To address such problems, an instruction-tuned large language model (LLM) with a cross-encoder architecture could be a reasonable choice. To further strengthen pioneering LLM-based retrievers, we design a special instruction that transforms the retrieval task into a generative task by prompting the LLM to answer a binary-choice question. The model can be fine-tuned with direct preference optimization (DPO). The framework is…
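The binary-choice formulation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the prompt wording in `PROMPT_TEMPLATE`, the helper names, and the `logit_fn` callback (a stand-in for a real LLM call that returns the logits of the two answer tokens) are all hypothetical.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical instruction; the paper's exact prompt wording is not shown here.
PROMPT_TEMPLATE = (
    "Query: {query}\n"
    "Candidate: {document}\n"
    "Is the candidate an appropriate response to the query?\n"
    "(A) Yes\n"
    "(B) No\n"
    "Answer:"
)


def build_prompt(query: str, document: str) -> str:
    """Pack query and document into one prompt (cross-encoder style)."""
    return PROMPT_TEMPLATE.format(query=query, document=document)


def relevance_score(logit_yes: float, logit_no: float) -> float:
    """Two-way softmax over the answer-token logits, i.e. P(yes)."""
    m = max(logit_yes, logit_no)  # subtract the max for numerical stability
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


def rank(query: str,
         documents: List[str],
         logit_fn: Callable[[str], Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Score every candidate and sort by descending relevance.

    `logit_fn` is an assumed callback mapping a prompt to the pair
    (logit of the "yes" token, logit of the "no" token).
    """
    scored = [(doc, relevance_score(*logit_fn(build_prompt(query, doc))))
              for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice `logit_fn` would wrap a forward pass of the instruction-tuned LLM and read the next-token logits for the two answer options; keeping it as a callback lets the ranking logic be tested without loading a model.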

Code & Models

Repositories

flyfree5/lahore (PyTorch, official)

Taxonomy

Topics: Topic Modeling