CiMRAG: CiM-Aware Domain-Adaptive and Noise-Resilient Retrieval-Augmented Generation for Edge-Based LLMs
Shih-Hsuan Chiu, Ming-Syan Chen

TL;DR
This paper introduces CiMRAG, a retrieval-augmented generation framework optimized for edge devices that enhances noise resilience and domain adaptability using a novel embedding learning method called TONEL.
Contribution
The paper presents TONEL, a new noise-aware embedding learning framework that improves retrieval accuracy in noisy, multi-domain edge environments for RAG systems.
Findings
TONEL significantly improves retrieval accuracy under noisy conditions.
The proposed method outperforms strong baselines on personalization benchmarks.
It demonstrates practical effectiveness for edge-based LLM applications.
Abstract
Personalized virtual assistants powered by large language models (LLMs) on edge devices are attracting growing attention, with Retrieval-Augmented Generation (RAG) emerging as a key method for personalization by retrieving relevant profile data and generating tailored responses. However, deploying RAG on edge devices faces efficiency hurdles due to the rapid growth of profile data, such as user-LLM interactions and recent updates. While Computing-in-Memory (CiM) architectures mitigate this bottleneck by eliminating data movement between memory and processing units via in-situ operations, they are susceptible to environmental noise that can degrade retrieval precision. This poses a critical issue in dynamic, multi-domain edge-based scenarios (e.g., travel, medicine, and law) where both accuracy and adaptability are paramount. To address these challenges, we propose Task-Oriented…
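The abstract's concern that noise can degrade retrieval precision can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's TONEL method or a model of any real CiM device: stored embeddings are random unit vectors, device noise is modeled as additive Gaussian noise on the stored vectors, and retrieval is plain inner-product top-1 search. The function name top1_accuracy and all parameter values are hypothetical.

```python
import numpy as np

def top1_accuracy(doc_emb, query_emb, targets, noise_std, rng):
    """Fraction of queries whose top-1 match is the intended document
    when the stored embeddings carry additive Gaussian noise.

    Assumption: Gaussian noise is a stand-in for device noise; real
    CiM noise characteristics depend on the hardware.
    """
    noisy_docs = doc_emb + rng.normal(0.0, noise_std, size=doc_emb.shape)
    scores = query_emb @ noisy_docs.T  # inner-product similarity
    return float(np.mean(np.argmax(scores, axis=1) == targets))

rng = np.random.default_rng(0)
n_docs, dim = 200, 64

# Hypothetical profile embeddings: random unit vectors.
docs = rng.normal(size=(n_docs, dim))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

# Each query is a slightly perturbed copy of its target document.
targets = np.arange(n_docs)
queries = docs + 0.05 * rng.normal(size=docs.shape)

for std in (0.0, 0.1, 0.3, 1.0):
    acc = top1_accuracy(docs, queries, targets, std, rng)
    print(f"noise_std={std:.1f}  top-1 accuracy={acc:.2f}")
```

In this toy setup, top-1 accuracy falls as noise_std grows, which is the failure mode a noise-resilient embedding method would aim to reduce.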
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
