Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks
Zhang Liu, Hongyang Du, Lianfen Huang, Zhibin Gao, and Dusit Niyato

TL;DR
This paper proposes a reinforcement learning approach to optimize model caching and resource allocation in wireless edge networks for generative AI, balancing quality and latency amid dynamic conditions.
Contribution
The paper introduces a deep deterministic policy gradient (DDPG)-based method for joint model caching and resource allocation, addressing the memory and compute challenges of deploying large GenAI models at the edge.
Findings
Higher model hit ratio with DDPG
Lower latency in AIGC services
Superior performance over benchmark solutions
Abstract
With the rapid advancement of artificial intelligence (AI), generative AI (GenAI) has emerged as a transformative tool, enabling customized and personalized AI-generated content (AIGC) services. However, GenAI models with billions of parameters require substantial memory capacity and computational power for deployment and execution, presenting significant challenges to resource-limited edge networks. In this paper, we address the joint model caching and resource allocation problem in GenAI-enabled wireless edge networks. Our objective is to balance the trade-off between delivering high-quality AIGC and minimizing the delay in AIGC service provisioning. To tackle this problem, we employ a deep deterministic policy gradient (DDPG)-based reinforcement learning approach, capable of efficiently determining optimal model caching and resource allocation decisions for AIGC services in response…
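To make the idea of a joint caching and allocation decision concrete, here is a minimal sketch of how a continuous DDPG actor output might be decoded into feasible decisions. It is not the authors' formulation, which the truncated abstract does not give. The names `decode_action` and `reward`, the 0.5 caching threshold, the greedy capacity check, the use of bandwidth shares as the allocated resource, and the linear quality/delay weighting are all assumptions made for illustration.

```python
from typing import List, Tuple


def decode_action(action: List[float], model_sizes_gb: List[float],
                  capacity_gb: float, num_users: int) -> Tuple[List[int], List[float]]:
    """Map a continuous actor output in [0, 1] to feasible decisions.

    Hypothetical layout: the first len(model_sizes_gb) entries are caching
    scores, the remaining num_users entries are raw bandwidth shares.
    """
    num_models = len(model_sizes_gb)
    assert len(action) == num_models + num_users
    cache_scores = action[:num_models]
    raw_shares = action[num_models:]

    # Assumed rule: cache models in descending score order, skipping any
    # model that scores at most 0.5 or would exceed the memory capacity.
    cached = [0] * num_models
    used_gb = 0.0
    for m in sorted(range(num_models), key=lambda i: cache_scores[i], reverse=True):
        if cache_scores[m] > 0.5 and used_gb + model_sizes_gb[m] <= capacity_gb:
            cached[m] = 1
            used_gb += model_sizes_gb[m]

    # Normalise raw shares so the allocated fractions sum to 1.
    total = sum(raw_shares)
    if total > 0:
        shares = [s / total for s in raw_shares]
    else:
        shares = [1.0 / num_users] * num_users
    return cached, shares


def reward(quality: float, delay_s: float, weight: float = 0.5) -> float:
    """Assumed scalar reward: weighted quality minus weighted delay."""
    return weight * quality - (1.0 - weight) * delay_s


if __name__ == "__main__":
    cached, shares = decode_action([0.9, 0.8, 0.2, 1.0, 3.0],
                                   [4.0, 5.0, 2.0], capacity_gb=8.0, num_users=2)
    print(cached, shares)
```

In a full DDPG loop, the actor network would emit `action`, the environment would apply the decoded decisions, and a reward of this kind would drive the critic update. The weighting between quality and delay is a design choice that a real system would need to tune.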
Taxonomy
Topics: Caching and Content Delivery · Opportunistic and Delay-Tolerant Networks · Cooperative Communication and Network Coding
