HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights
Ozan Gokdemir, Carlo Siebenschuh, Alexander Brace, Azton Wells, Brian Hsu, Kyle Hippe, Priyanka V. Setty, Aswathy Ajith, J. Gregory Pauloski, Varuni Sastry, Sam Foreman, Huihuo Zheng, Heng Ma, Bharat Kale, Nicholas Chia, Thomas Gibbs, Michael E. Papka, Thomas Brettin

TL;DR
HiPerRAG leverages high-performance computing to enable scalable retrieval-augmented generation from over 3.6 million scientific articles, significantly improving scientific question answering accuracy and facilitating cross-disciplinary research.
Contribution
The paper introduces HiPerRAG, a novel HPC-powered RAG workflow with new models Oreo and ColTrast, designed to handle large-scale scientific knowledge retrieval efficiently.
Findings
Achieves 90% accuracy on SciQ benchmark
Attains 76% accuracy on PubMedQA benchmark
Outperforms domain-specific models and GPT-4 in scientific QA
Abstract
The volume of scientific literature is growing exponentially, leading to underutilized discoveries, duplicated efforts, and limited cross-disciplinary collaboration. Retrieval Augmented Generation (RAG) offers a way to assist scientists by improving the factuality of Large Language Models (LLMs) in processing this influx of information. However, scaling RAG to handle millions of articles introduces significant challenges, including the high computational costs associated with parsing documents and embedding scientific knowledge, as well as the algorithmic complexity of aligning these representations with the nuanced semantics of scientific content. To address these issues, we introduce HiPerRAG, a RAG workflow powered by high performance computing (HPC) to index and retrieve knowledge from more than 3.6 million scientific articles. At its core are Oreo, a high-throughput model for…
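As a rough illustration of the retrieve-then-generate pattern the abstract refers to, the sketch below ranks passages against a question and assembles a grounded prompt. It is not HiPerRAG's pipeline and does not represent Oreo or ColTrast: the bag-of-words `embed` function, the `PASSAGES` mini-corpus, and the prompt wording are all hypothetical stand-ins chosen so the example runs with the Python standard library alone. A real system would swap in a learned encoder and a vector index.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words term counts, not a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Assemble a prompt that asks a language model to answer from retrieved context only."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical mini-corpus standing in for an indexed article collection.
PASSAGES = [
    "Mitochondria produce most of the ATP used by the cell.",
    "Chloroplasts carry out photosynthesis in plant cells.",
    "Ribosomes translate messenger RNA into protein.",
]

if __name__ == "__main__":
    print(build_prompt("What produces ATP in the cell?", PASSAGES, k=1))
```

The prompt string would then be passed to a language model. Keeping retrieval separate from generation is what lets the indexed collection grow without retraining the model.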
Taxonomy
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Attention Dropout · Softmax · Absolute Position Encodings
