TrimCaching: Parameter-sharing AI Model Caching in Wireless Edge Networks

Guanqiao Qu; Zheng Lin; Fangming Liu; Xianhao Chen; Kaibin Huang

arXiv:2405.03990 · cs.NI · May 21, 2024 · 1 citation

Open Access

TL;DR

This paper introduces TrimCaching, a novel parameter-sharing model caching scheme for wireless edge networks that enhances storage efficiency and reduces latency by sharing model parameters across AI models.

Contribution

It formulates a parameter-sharing cache placement problem, analyzes its complexity, and develops approximation algorithms for both special and general cases.

Findings

01. Significant improvement in cache hit ratio over existing caching methods.

02. Effective algorithmic solutions with provable approximation guarantees.

03. Enhanced storage efficiency by sharing model parameters across multiple AI models.

Abstract

Next-generation mobile networks are expected to facilitate fast AI model downloading to end users. By caching models on edge servers, mobile networks can deliver models to end users with low latency, resulting in a paradigm called edge model caching. In this paper, we develop a novel model placement scheme, called parameter-sharing model caching (TrimCaching). TrimCaching exploits the key observation that a wide range of AI models, such as convolutional neural networks or large language models, can share a significant proportion of parameter blocks containing reusable knowledge, thereby improving storage efficiency. To this end, we formulate a parameter-sharing model placement problem to maximize the cache hit ratio in multi-edge wireless networks by balancing the fundamental tradeoff between storage efficiency and service latency. We show that the formulated problem is a submodular…
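The storage-efficiency observation in the abstract can be illustrated with a minimal sketch. It is not the paper's placement algorithm: it only compares the storage an edge server needs when each cached model keeps its own copy of every parameter block against the storage needed when a block shared by several models is stored once. The model names, block names, and block sizes below are hypothetical values chosen for illustration.

```python
from typing import Dict, Iterable, Set

# Hypothetical parameter-block sizes in MB (illustrative values, not from the paper).
BLOCK_SIZE_MB: Dict[str, int] = {
    "backbone": 90,
    "head_a": 10,
    "head_b": 12,
    "head_c": 8,
}

# Hypothetical models, each described by the set of parameter blocks it uses.
# All three reuse the same "backbone" block.
MODEL_BLOCKS: Dict[str, Set[str]] = {
    "model_a": {"backbone", "head_a"},
    "model_b": {"backbone", "head_b"},
    "model_c": {"backbone", "head_c"},
}


def independent_storage_mb(models: Iterable[str]) -> int:
    """Storage when every cached model keeps a private copy of all its blocks."""
    return sum(BLOCK_SIZE_MB[b] for m in models for b in MODEL_BLOCKS[m])


def shared_storage_mb(models: Iterable[str]) -> int:
    """Storage when each distinct block is stored once and shared across models."""
    unique_blocks: Set[str] = set().union(*(MODEL_BLOCKS[m] for m in models))
    return sum(BLOCK_SIZE_MB[b] for b in unique_blocks)


if __name__ == "__main__":
    cached = ["model_a", "model_b", "model_c"]
    print("independent copies:", independent_storage_mb(cached), "MB")
    print("shared blocks:     ", shared_storage_mb(cached), "MB")
```

With these assumed sizes, caching the three models independently takes 300 MB, while storing the shared backbone once takes 120 MB. The freed capacity is what would let a server cache more models under the same storage budget.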

Taxonomy

Topics: Caching and Content Delivery · Energy Efficient Wireless Sensor Networks · IoT and Edge/Fog Computing
