MobileRAG: A Fast, Memory-Efficient, and Energy-Efficient Method for On-Device RAG
Taehwan Park, Geonho Lee, Min-Soo Kim

TL;DR
MobileRAG is an on-device retrieval-augmented generation pipeline that combines a mobile-friendly vector search algorithm with content reduction to deliver fast, memory-efficient, and privacy-preserving AI applications on mobile devices.
Contribution
The paper presents MobileRAG, a fully on-device RAG system that uses EcoVector (vector search) and SCR (Selective Content Reduction) to reduce resource usage while maintaining accuracy, enabling practical offline AI on mobile devices.
Findings
Significantly reduces latency, memory footprint, and power consumption.
Maintains accuracy comparable to server-based RAG methods.
Enables effective offline operation for privacy-sensitive applications.
Abstract
Retrieval-Augmented Generation (RAG) has proven effective on server infrastructures, but its application on mobile devices is still underexplored due to limited memory and power resources. Existing vector search and RAG solutions largely assume abundant computation resources, making them impractical for on-device scenarios. In this paper, we propose MobileRAG, a fully on-device pipeline that overcomes these limitations by combining a mobile-friendly vector search algorithm, EcoVector, with a lightweight Selective Content Reduction (SCR) method. By partitioning and partially loading index data, EcoVector drastically reduces both memory footprint and CPU usage, while the SCR method filters out irrelevant text to diminish Language Model (LM) input size without degrading accuracy. Extensive experiments demonstrated that MobileRAG significantly outperforms conventional…
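The two ideas named in the abstract, loading only part of a partitioned index and filtering irrelevant text before it reaches the LM, can be illustrated with a minimal Python sketch. The abstract does not describe the actual algorithms, so this is not EcoVector or SCR: it assumes a generic centroid-partitioned index with one file per partition and a simple word-overlap sentence filter. All names (`PartitionedIndex`, `nprobe`, `reduce_content`, `min_overlap`) are hypothetical.

```python
import json
import os
import re
import tempfile


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class PartitionedIndex:
    """Hypothetical index: centroids stay in RAM, partitions stay on disk."""

    def __init__(self, directory, centroids):
        self.directory = directory
        self.centroids = centroids  # small and always resident
        self.loads = 0  # number of partition files read (for illustration)

    @classmethod
    def build(cls, directory, centroids, vectors):
        # Assign each (id, vector) pair to its nearest centroid,
        # then write one JSON file per partition.
        buckets = [[] for _ in centroids]
        for vec_id, vec in vectors:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(vec, centroids[i]))
            buckets[nearest].append([vec_id, vec])
        for i, bucket in enumerate(buckets):
            with open(os.path.join(directory, f"part_{i}.json"), "w") as f:
                json.dump(bucket, f)
        return cls(directory, centroids)

    def _load(self, i):
        self.loads += 1
        with open(os.path.join(self.directory, f"part_{i}.json")) as f:
            return json.load(f)

    def search(self, query, k=2, nprobe=1):
        # Read only the nprobe partitions whose centroids are closest to
        # the query; every other partition is never loaded into memory.
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(query, self.centroids[i]))
        candidates = []
        for i in order[:nprobe]:
            for vec_id, vec in self._load(i):
                candidates.append((sq_dist(query, vec), vec_id))
        candidates.sort()
        return [vec_id for _, vec_id in candidates[:k]]


def reduce_content(query, passage, min_overlap=1):
    """Keep only sentences sharing at least min_overlap words with the query.

    An assumed stand-in for content reduction, not the paper's SCR method.
    """
    query_terms = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    kept = [s for s in sentences
            if len(query_terms & set(re.findall(r"\w+", s.lower()))) >= min_overlap]
    return " ".join(kept)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index = PartitionedIndex.build(
            tmp,
            centroids=[[0.0, 0.0], [10.0, 10.0]],
            vectors=[("a", [0.0, 1.0]), ("b", [1.0, 0.0]),
                     ("c", [10.0, 9.0]), ("d", [9.0, 10.0])],
        )
        print(index.search([0.2, 0.9], k=1, nprobe=1), index.loads)
    print(reduce_content(
        "battery life",
        "The phone has long battery life. The sky is blue. Charging is fast."))
```

In this sketch, memory use scales with `nprobe` rather than with the size of the whole index, which is the general trade-off that partial loading targets. The overlap filter shortens the LM input at the risk of dropping sentences that are relevant but phrased differently from the query.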
Taxonomy
Topics: IoT and Edge/Fog Computing · Green IT and Sustainability · Advanced Malware Detection Techniques
