IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

Diji Yang; Jinmeng Rao; Kezhen Chen; Xiaoyuan Guo; Yawen Zhang; Jie Yang; Yi Zhang

arXiv:2405.13021·cs.CL·May 24, 2024

Open Access

TL;DR

IM-RAG introduces a novel multi-round retrieval-augmented generation framework that leverages learned inner monologues and reinforcement learning to improve reasoning, interpretability, and flexibility in large language models.

Contribution

The paper presents IM-RAG, a new LLM-centric approach integrating IR systems with learned inner monologues, optimized via RL and SFT for enhanced multi-round reasoning.

Findings

01. Achieves state-of-the-art results on HotPotQA.
02. Provides high interpretability through learned inner monologues.
03. Enhances flexibility in IR module integration.

Abstract

Although the Retrieval-Augmented Generation (RAG) paradigms can use external knowledge to enhance and ground the outputs of Large Language Models (LLMs) to mitigate generative hallucinations and static knowledge base problems, they still suffer from limited flexibility in adopting Information Retrieval (IR) systems with varying capabilities, constrained interpretability during the multi-round retrieval process, and a lack of end-to-end optimization. To address these challenges, we propose a novel LLM-centric approach, IM-RAG, that integrates IR systems with LLMs to support multi-round RAG through learning Inner Monologues (IM, i.e., the human inner voice that narrates one's thoughts). During the IM process, the LLM serves as the core reasoning model (i.e., Reasoner) to either propose queries to collect more information via the Retriever or to provide a final answer based on the…
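The control flow described in the abstract, where the Reasoner either proposes a query for the Retriever or commits to a final answer, can be pictured as a simple loop. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `("query" | "answer", text)` action format, the `max_rounds` cap, and the `force_answer` flag are all hypothetical, and both the Reasoner and the Retriever are toy stand-ins. The RL and SFT training mentioned above is not modeled.

```python
from typing import Callable, List, Tuple

# Hypothetical action format: ("query", text) or ("answer", text).
Action = Tuple[str, str]


def multi_round_rag(
    question: str,
    reasoner: Callable[..., Action],
    retriever: Callable[[str], List[str]],
    max_rounds: int = 3,  # assumed cap; not taken from the paper
) -> Tuple[str, List[str]]:
    """Alternate between reasoning and retrieval until an answer is given."""
    monologue: List[str] = []  # running trace of queries and evidence
    for _ in range(max_rounds):
        kind, text = reasoner(question, monologue)
        if kind == "answer":
            return text, monologue
        # The Reasoner asked for more information: record the query and results.
        monologue.append(f"query: {text}")
        monologue.extend(f"evidence: {p}" for p in retriever(text))
    # Round budget exhausted: ask the Reasoner to commit to an answer.
    _, text = reasoner(question, monologue, force_answer=True)
    return text, monologue


def toy_retriever(query: str) -> List[str]:
    """Stand-in for an IR system: a one-entry lookup table."""
    corpus = {"capital of France": ["Paris is the capital of France."]}
    return corpus.get(query, [])


def toy_reasoner(question: str, monologue: List[str], force_answer: bool = False) -> Action:
    """Stand-in for the LLM: answer once any evidence has been collected."""
    evidence = [m for m in monologue if m.startswith("evidence: ")]
    if evidence:
        return ("answer", evidence[-1][len("evidence: "):])
    if force_answer:
        return ("answer", "unknown")
    return ("query", "capital of France")


answer, trace = multi_round_rag(
    "What is the capital of France?", toy_reasoner, toy_retriever
)
```

Because the loop only requires a callable that maps a query string to a list of passages, any retriever with that shape can be swapped in, and the returned trace gives a readable record of each round.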

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Linear Layer · Multi-Head Attention · Residual Connection · Weight Decay · Byte Pair Encoding