LiR$^3$AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation
Guo Chen, Junjie Huang, Huaijin Xie, Fei Sun, Tao Jia

TL;DR
This paper introduces LiR$^3$AG, a lightweight framework that restructures retrieved evidence into reasoning chains, significantly reducing computational costs while enhancing non-reasoning models' performance in retrieval-augmented generation tasks.
Contribution
The paper proposes a framework that lets non-reasoning models mimic the reasoning strategies of reasoning models, reducing token and inference costs while surpassing larger reasoning models on multi-hop QA tasks.
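To make the idea of handing a non-reasoning model a pre-structured evidence chain more concrete, here is a minimal sketch. It is not the paper's method: the `Passage` type, its `score` field, the `build_chain_prompt` helper, and the prompt wording are all hypothetical, and ordering passages by score is only a stand-in for whatever reranking the framework actually performs.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    score: float  # hypothetical relevance score from a retriever or reranker


def build_chain_prompt(question: str, passages: list[Passage], top_k: int = 3) -> str:
    """Keep the top_k passages by score and lay them out as numbered steps.

    Illustrative assumption only: the real framework's reranking and
    chain construction are not described in this summary.
    """
    ranked = sorted(passages, key=lambda p: p.score, reverse=True)[:top_k]
    steps = [f"Step {i}: {p.text}" for i, p in enumerate(ranked, start=1)]
    return "\n".join(
        [f"Question: {question}", "Evidence chain:", *steps, "Answer concisely:"]
    )


if __name__ == "__main__":
    docs = [
        Passage("The film was directed by Person A.", 0.4),
        Passage("Person A was born in City B.", 0.9),
    ]
    print(build_chain_prompt("Where was the film's director born?", docs, top_k=2))
```

The point of the sketch is that the model receives short, ordered evidence rather than producing a long reasoning trace itself, which is the kind of change that would plausibly lower output token counts.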
Findings
Reduces output tokens by 98%
Cuts inference time by 58.6%
Improves an 8B model's F1 score by up to 22.5%
Abstract
Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performance in multi-hop QA tasks, which require integrating and reasoning over multiple pieces of evidence across different documents to answer a complex question. However, they often introduce substantial computational costs, including increased token consumption and inference latency. To better understand and mitigate this trade-off, we conduct a comprehensive study of reasoning strategies for reasoning models in RAG multi-hop QA tasks. Our findings reveal that reasoning models adopt structured strategies to integrate retrieved and internal knowledge, primarily following two modes: Context-Grounded Reasoning, which relies directly on retrieved content, and Knowledge-Reconciled Reasoning, which…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior
