Multi-Reranker: Maximizing performance of retrieval-augmented generation in the FinanceRAG challenge
Joohyun Lee, Minji Roh

TL;DR
This paper presents a high-performance, finance-specific Retrieval-Augmented Generation system optimized through ablation studies and reranker models, achieving second place in the ACM-ICAIF '24 FinanceRAG challenge.
Contribution
It introduces novel methods for query expansion, corpus refinement, and long-context management to enhance retrieval accuracy and response quality in financial data analysis.
Findings
Achieved 2nd place in the FinanceRAG Challenge
Improved retrieval accuracy with multiple rerankers
Enhanced long-context handling for better responses
Abstract
As Large Language Models (LLMs) increasingly address domain-specific problems, their application in the financial sector has expanded rapidly. Tasks that are both highly valuable and time-consuming, such as analyzing financial statements, disclosures, and related documents, are now being effectively tackled using LLMs. This paper details the development of a high-performance, finance-specific Retrieval-Augmented Generation (RAG) system for the ACM-ICAIF '24 FinanceRAG competition. We optimized performance through ablation studies on query expansion and corpus refinement during the pre-retrieval phase. To enhance retrieval accuracy, we employed multiple reranker models. Notably, we introduced an efficient method for managing long context sizes during the generation phase, significantly improving response quality without sacrificing performance. We ultimately achieve 2nd place in the…
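The abstract says multiple reranker models were used but does not say how their outputs were combined. As a minimal sketch of one common option, the snippet below merges several rerankers' ranked lists with reciprocal rank fusion (RRF). This is an assumption made for illustration, not the authors' confirmed method; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    rankings: list of lists, each ordered best-first by one reranker.
    k: smoothing constant (60 is a conventional default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each reranker contributes more for documents it ranks higher.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of three rerankers for the same query.
reranker_outputs = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(reranker_outputs)
```

RRF uses only rank positions, so it sidesteps the fact that different rerankers produce scores on different scales. A weighted average of normalized scores would be another reasonable choice.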
Taxonomy
Topics: Recommender Systems and Techniques · Caching and Content Delivery · Algorithms and Data Compression
