Searching for Best Practices in Retrieval-Augmented Generation
Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo, Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze, Lv, Xiaoqing Zheng, Xuanjing Huang

TL;DR
This paper investigates various retrieval-augmented generation strategies to optimize performance and efficiency, and explores multimodal retrieval techniques to improve visual question answering and content generation.
Contribution
It systematically analyzes existing RAG approaches, proposes optimal deployment strategies, and demonstrates the benefits of multimodal retrieval for enhanced multimodal content generation.
Findings
Optimal RAG strategies balance performance and efficiency.
Multimodal retrieval significantly improves visual question-answering.
Retrieval as generation accelerates multimodal content creation.
Abstract
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolonged response times. Typically, a RAG workflow involves multiple processing steps, each of which can be executed in various ways. Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices. Through extensive experiments, we suggest several strategies for deploying RAG that balance both performance and efficiency. Moreover, we demonstrate that multimodal retrieval techniques can significantly enhance question-answering capabilities about visual…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRecommender Systems and Techniques · Information Retrieval and Search Behavior · Speech and dialogue systems
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Linear Layer · Weight Decay · Multi-Head Attention · Residual Connection · WordPiece · Softmax · Byte Pair Encoding · Layer Normalization
