RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge
Ahmed Bin Khalid

TL;DR
RAGdb introduces a portable, zero-dependency architecture for multimodal retrieval-augmented generation that operates efficiently on edge devices, reducing infrastructure complexity and enhancing privacy.
Contribution
It presents a monolithic, SQLite-based system with a deterministic hybrid scoring function, eliminating GPU reliance and enabling efficient, local-first RAG on edge hardware.
Findings
Achieves 100% Recall@1 for entity retrieval
31.6x faster incremental ingestion on a consumer laptop
Reduces disk footprint by 99.5% compared to Docker stacks
Abstract
Retrieval-Augmented Generation (RAG) has established itself as the standard paradigm for grounding Large Language Models (LLMs) in domain-specific, up-to-date data. However, the prevailing architecture for RAG has evolved into a complex, distributed stack requiring cloud-hosted vector databases, heavy deep learning frameworks (e.g., PyTorch, CUDA), and high-latency embedding inference servers. This "infrastructure bloat" creates a significant barrier to entry for edge computing, air-gapped environments, and privacy-constrained applications where data sovereignty is paramount. This paper introduces RAGdb, a novel monolithic architecture that consolidates automated multimodal ingestion, ONNX-based extraction, and hybrid vector retrieval into a single, portable SQLite container. We propose a deterministic Hybrid Scoring Function (HSF) that combines sublinear TF-IDF vectorization with…
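The abstract is truncated before it names the second component of the HSF, so the following is only a minimal sketch of the one part it does state: sublinear TF-IDF vectors kept in a single SQLite file and ranked by cosine similarity. It uses only the Python standard library. The table layout, the smoothed IDF formula, and the helper names (`tokenize`, `build_index`, `search`) are assumptions made for illustration and are not taken from RAGdb.

```python
import json
import math
import sqlite3
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase whitespace split.
    return text.lower().split()


def build_index(conn, docs):
    """Store each document with its L2-normalized sublinear TF-IDF vector."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs "
        "(id INTEGER PRIMARY KEY, body TEXT, vec TEXT)"
    )
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant, not confirmed by the source).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    for body, toks in zip(docs, tokenized):
        tf = Counter(toks)
        # Sublinear term frequency: 1 + log(tf) instead of raw counts.
        vec = {t: (1.0 + math.log(c)) * idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vec = {t: w / norm for t, w in vec.items()}
        conn.execute(
            "INSERT INTO docs (body, vec) VALUES (?, ?)",
            (body, json.dumps(vec)),
        )
    conn.commit()
    return idf


def search(conn, idf, query, k=1):
    """Return the top-k (score, id, body) rows by cosine similarity."""
    tf = Counter(t for t in tokenize(query) if t in idf)
    q = {t: (1.0 + math.log(c)) * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in q.values())) or 1.0
    scored = []
    for doc_id, body, vec_json in conn.execute("SELECT id, body, vec FROM docs"):
        vec = json.loads(vec_json)
        score = sum(w * vec.get(t, 0.0) for t, w in q.items()) / norm
        scored.append((score, doc_id, body))
    # Highest score first; ties broken by insertion order for determinism.
    scored.sort(key=lambda row: (-row[0], row[1]))
    return scored[:k]
```

Calling `build_index(sqlite3.connect("store.db"), docs)` and then `search(...)` keeps the documents and vectors in one portable file, with no GPU or embedding server involved. The scan here is linear over all rows, which is a simplification; the abstract does not say how RAGdb organizes retrieval internally.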
Taxonomy
Topics: Big Data and Digital Economy · Graph Theory and Algorithms · Advanced Graph Neural Networks
