TrimCaching: Parameter-sharing AI Model Caching in Wireless Edge Networks
Guanqiao Qu, Zheng Lin, Fangming Liu, Xianhao Chen, Kaibin Huang

TL;DR
This paper introduces TrimCaching, a novel parameter-sharing model caching scheme for wireless edge networks that enhances storage efficiency and reduces latency by sharing model parameters across AI models.
Contribution
It formulates a parameter-sharing cache placement problem, analyzes its complexity, and develops approximation algorithms for both special and general cases.
Findings
Significant improvement in cache hit ratio over existing caching methods.
Effective algorithmic solutions with provable approximation guarantees.
Enhanced storage efficiency by sharing model parameters across multiple AI models.
Abstract
Next-generation mobile networks are expected to facilitate fast AI model downloading to end users. By caching models on edge servers, mobile networks can deliver models to end users with low latency, resulting in a paradigm called edge model caching. In this paper, we develop a novel model placement scheme, called parameter-sharing model caching (TrimCaching). TrimCaching exploits the key observation that a wide range of AI models, such as convolutional neural networks or large language models, can share a significant proportion of parameter blocks containing reusable knowledge, thereby improving storage efficiency. To this end, we formulate a parameter-sharing model placement problem to maximize the cache hit ratio in multi-edge wireless networks by balancing the fundamental tradeoff between storage efficiency and service latency. We show that the formulated problem is a submodular…
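The storage benefit described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified single-server greedy heuristic that ignores service latency and the multi-edge setting. The model names, block names, sizes, and request probabilities below are hypothetical values chosen for illustration. The only idea taken from the abstract is that a parameter block shared by several cached models is stored once.

```python
from typing import Dict, List, Set

# Hypothetical example data (not from the paper): models A and B share a
# large "backbone" block; model C shares nothing with them.
BLOCK_SIZE: Dict[str, int] = {"backbone": 80, "head_a": 10, "head_b": 10, "other": 90}
BLOCKS_OF: Dict[str, Set[str]] = {
    "A": {"backbone", "head_a"},
    "B": {"backbone", "head_b"},
    "C": {"other"},
}
DEMAND: Dict[str, float] = {"A": 0.4, "B": 0.3, "C": 0.3}  # request probabilities


def storage_cost(models: Set[str]) -> int:
    """Storage used when every distinct parameter block is stored only once."""
    needed: Set[str] = set()
    for m in models:
        needed |= BLOCKS_OF[m]
    return sum(BLOCK_SIZE[b] for b in needed)


def greedy_place(capacity: int) -> List[str]:
    """Illustrative greedy: repeatedly cache the model with the highest
    demand per unit of *additional* storage, given blocks already stored."""
    cached: List[str] = []
    stored: Set[str] = set()
    used = 0
    remaining = set(BLOCKS_OF)
    while remaining:
        best, best_ratio, best_extra = None, 0.0, 0
        for m in sorted(remaining):  # sorted for deterministic tie-breaking
            extra = sum(BLOCK_SIZE[b] for b in BLOCKS_OF[m] - stored)
            if used + extra > capacity:
                continue  # model does not fit even after reusing shared blocks
            ratio = DEMAND[m] / extra if extra > 0 else float("inf")
            if ratio > best_ratio:
                best, best_ratio, best_extra = m, ratio, extra
        if best is None:
            break  # nothing else fits
        cached.append(best)
        stored |= BLOCKS_OF[best]
        used += best_extra
        remaining.discard(best)
    return cached


placement = greedy_place(capacity=100)
hit_ratio = sum(DEMAND[m] for m in placement)
```

With a capacity of 100 units, storing A and B as independent copies would need 180 units, so only one of them would fit. With the backbone stored once, both fit in exactly 100 units, and the fraction of requests served from the cache rises accordingly.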
Taxonomy
Topics: Caching and Content Delivery · Energy Efficient Wireless Sensor Networks · IoT and Edge/Fog Computing
