Recall: Empowering Multimodal Embedding for Edge Devices
Dongqi Cai, Shangguang Wang, Chen Peng, Zeling Zhang, Mengwei Xu

TL;DR
RECALL is a new on-device multimodal embedding system designed for mobile devices that balances high retrieval accuracy and throughput with minimal resource consumption, enabling efficient memory recall on resource-limited hardware.
Contribution
The paper introduces RECALL, a resource-efficient multimodal embedding system for mobile devices that combines coarse-grained embeddings with query-based filtering to deliver accurate retrieval at high throughput.
Findings
High-throughput retrieval with minimal memory use
Accurate multimodal embeddings suitable for mobile devices
Operates with low energy consumption
Abstract
Human memory is inherently prone to forgetting. To address this, multimodal embedding models have been introduced, which transform diverse real-world data into a unified embedding space. These embeddings can be retrieved efficiently, aiding mobile users in recalling past information. However, as model complexity grows, so do its resource demands, leading to reduced throughput and heavy computational requirements that limit mobile device implementation. In this paper, we introduce RECALL, a novel on-device multimodal embedding system optimized for resource-limited mobile environments. RECALL achieves high-throughput, accurate retrieval by generating coarse-grained embeddings and leveraging query-based filtering for refined retrieval. Experimental results demonstrate that RECALL delivers high-quality embeddings with superior throughput, all while operating unobtrusively with minimal…
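The abstract describes retrieval in two steps: coarse-grained embeddings first, then query-based filtering to refine the result. The following is a minimal sketch of that general two-stage pattern, not the authors' implementation. The function names (`cosine`, `two_stage_retrieve`, `refine`), the toy vectors, and the parameters `k_coarse` and `k_final` are all hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_retrieve(query, coarse_index, refine_fn, k_coarse=2, k_final=1):
    """Shortlist items with cheap coarse embeddings, then rescore the shortlist.

    coarse_index: dict mapping item id -> coarse embedding (assumed precomputed).
    refine_fn:    callable returning a finer embedding for one item id; it is
                  only invoked for shortlisted items, which is where the
                  compute saving in this sketch comes from.
    """
    # Stage 1: rank every item by similarity to its coarse embedding.
    ranked = sorted(
        coarse_index,
        key=lambda item_id: cosine(query, coarse_index[item_id]),
        reverse=True,
    )
    candidates = ranked[:k_coarse]

    # Stage 2: query-time filtering, rescoring only the shortlisted items.
    rescored = sorted(
        candidates,
        key=lambda item_id: cosine(query, refine_fn(item_id)),
        reverse=True,
    )
    return rescored[:k_final]


# Toy data (hypothetical): four stored items with 2-D embeddings.
coarse_index = {
    "photo_beach": [0.9, 0.1],
    "photo_lake": [0.8, 0.2],
    "note_meeting": [0.1, 0.9],
    "audio_call": [0.0, 1.0],
}
fine_embeddings = {
    "photo_beach": [0.7, 0.3],
    "photo_lake": [1.0, 0.0],
    "note_meeting": [0.1, 0.9],
    "audio_call": [0.0, 1.0],
}

refined_calls = []  # records which items paid for the expensive step


def refine(item_id):
    refined_calls.append(item_id)
    return fine_embeddings[item_id]


query = [1.0, 0.0]
result = two_stage_retrieve(query, coarse_index, refine, k_coarse=2, k_final=1)
print(result, refined_calls)
```

In this toy run the coarse stage ranks `photo_beach` first, but rescoring the two shortlisted items moves `photo_lake` to the top, and the refinement function is never called for the two items that the coarse stage filtered out.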
Taxonomy
Topics: Team Dynamics and Performance · Personal Information Management and User Behavior · Speech and dialogue systems
