Towards a RAG-based Summarization Agent for the Electron-Ion Collider
Karthik Suresh, Neeltje Kackar, Luke Schleck, Cristiano Fanelli

TL;DR
This paper presents RAGS4EIC, a Retrieval Augmented Generation (RAG)-based summarization agent for the Electron-Ion Collider (EIC). It queries a vector database of experiment information and uses a large language model to generate concise summaries with citations, making that information easier to navigate for new collaborators and early-career scientists.
Contribution
The paper introduces a novel RAG-based summarization framework tailored for the EIC community, integrating prompt-tuning, LangChain, and a web app for scalable, accurate data summarization.
Findings
Effective summarization with citation referencing
Flexible prompt instruction-tuning improves accuracy
Web app demonstrates practical deployment
Abstract
The complexity and sheer volume of information encompassing documents, papers, data, and other resources from large-scale experiments demand significant time and effort to navigate, making the task of accessing and utilizing these varied forms of information daunting, particularly for new collaborators and early-career scientists. To tackle this issue, a Retrieval Augmented Generation (RAG)--based Summarization AI for EIC (RAGS4EIC) is under development. This AI-Agent not only condenses information but also effectively references relevant responses, offering substantial advantages for collaborators. Our project involves a two-step approach: first, querying a comprehensive vector database containing all pertinent experiment information; second, utilizing a Large Language Model (LLM) to generate concise summaries enriched with citations based on user queries and retrieved data. We…
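The two-step approach described in the abstract (retrieve, then summarize with citations) can be illustrated with a minimal sketch. This is not the authors' implementation: the corpus `DOCS`, the helpers `embed`, `retrieve`, and `build_prompt`, and the prompt wording are all hypothetical, and a bag-of-words cosine similarity stands in for the learned embeddings and vector database a real system would use. The sketch stops at building the prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would index experiment documents
# in a vector database rather than a dict.
DOCS = {
    "doc-1": "The tracking detector measures charged particle momentum.",
    "doc-2": "The calorimeter measures the energy of electrons and photons.",
    "doc-3": "Collaboration meetings are held every week.",
}

def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Step 1: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs.items(), key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return ranked[:k]

def build_prompt(query, hits):
    """Step 2 (input side): ask for a summary that cites the retrieved ids."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return (
        "Answer the question using only the sources below. "
        "Cite sources by their bracketed id.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nSummary:"
    )

if __name__ == "__main__":
    question = "What measures the energy of electrons?"
    print(build_prompt(question, retrieve(question, DOCS)))
```

Tagging each retrieved passage with its identifier inside the prompt is one simple way to let the model attach citations to its summary; the retrieval step keeps unrelated documents out of the context entirely.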
Taxonomy
Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · Data Quality and Management
Methods: Attention Is All You Need · Dropout · Linear Layer · Dense Connections · Adam · Layer Normalization · Weight Decay · WordPiece · Linear Warmup With Linear Decay
