From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base

Yuzhou Zhu

arXiv:2508.05662·cs.IR·August 11, 2025

TL;DR

This paper introduces Streaming RAG, a pipeline that updates and queries a knowledge base built from streaming data in real time. It reports Recall@10 gains of up to 3 points, end-to-end latency below 15 ms, and throughput above 900 documents per second under a 150 MB memory budget.

Contribution

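The paper presents a Streaming RAG pipeline that combines multi-vector cosine screening, mini-batch clustering, and a counter-based heavy-hitter filter to maintain a compact prototype set. It also proves an approximation bound linking retrieval quality to clustering variance, and describes an incremental index upsert that refreshes prototypes without interrupting queries.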
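To make the screening and clustering stages named above concrete, here is a minimal sketch in pure Python. It is not the paper's implementation: the function names `screen` and `minibatch_update`, the threshold value, and the running-mean update rule (the standard mini-batch k-means step) are all assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def screen(doc_vec, reference_vecs, threshold):
    """Hypothetical screening rule: keep a document if it is close
    to at least one of several reference vectors."""
    return max(cosine(doc_vec, r) for r in reference_vecs) >= threshold

def minibatch_update(prototypes, counts, batch):
    """One mini-batch k-means step (an assumed update rule): each vector
    pulls its most similar prototype toward itself with step 1/count."""
    for vec in batch:
        j = max(range(len(prototypes)), key=lambda i: cosine(vec, prototypes[i]))
        counts[j] += 1
        lr = 1.0 / counts[j]
        prototypes[j] = [(1 - lr) * p + lr * x for p, x in zip(prototypes[j], vec)]
    return prototypes, counts

# Usage: screen incoming vectors, then fold the survivors into the prototypes.
references = [[1.0, 0.0], [0.0, 1.0]]
incoming = [[1.0, 0.2], [1.0, 1.0]]
kept = [v for v in incoming if screen(v, references, threshold=0.9)]
protos, counts = minibatch_update([[1.0, 0.0], [0.0, 1.0]], [1, 1], kept)
```

The per-prototype step size shrinks as a prototype absorbs more points, so established prototypes move slowly while new ones adapt quickly. Whether the paper uses this exact schedule is not stated in the summary.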

Findings

01 Up to 3-point increase in Recall@10

02 Latency below 15 ms per query

03 Throughput exceeding 900 documents/sec

Abstract

Dynamic streams from news feeds, social media, sensor networks, and financial markets challenge static RAG frameworks. Full-scale indices incur high memory costs; periodic rebuilds introduce latency that undermines data freshness; naive sampling sacrifices semantic coverage. We present Streaming RAG, a unified pipeline that combines multi-vector cosine screening, mini-batch clustering, and a counter-based heavy-hitter filter to maintain a compact prototype set. We further prove an approximation bound $E[R(K_t)] \ge R^* - L\Delta$ linking retrieval quality to clustering variance. An incremental index upsert mechanism refreshes prototypes without interrupting queries. Experiments on eight real-time streams show statistically significant gains in Recall@10 (up to 3 points, p < 0.01), end-to-end latency below 15 ms, and throughput above 900 documents per second under a 150 MB budget.…
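The abstract does not say which counter-based heavy-hitter algorithm is used. As one plausible illustration, the sketch below uses the classic Misra-Gries summary, which tracks frequent items with at most k - 1 counters. The function name `misra_gries` and the idea of feeding it cluster IDs are assumptions, not details from the paper.

```python
def misra_gries(stream, k):
    """Misra-Gries summary: keeps at most k - 1 counters and retains every
    item that occurs more than len(stream) / k times (it may also retain
    some less frequent items)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Usage: hypothetical cluster IDs assigned to arriving documents.
cluster_ids = ["a", "b", "a", "c", "a", "d", "a"]
summary = misra_gries(cluster_ids, k=3)
```

Memory use is bounded by k regardless of stream length, which fits the fixed memory budget described in the abstract. The stored counts are lower bounds on the true frequencies, so a second pass or a threshold is needed when exact counts matter.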


Taxonomy

Topics: AI-based Problem Solving and Planning