Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks

Zhang Liu, Hongyang Du, Lianfen Huang, Zhibin Gao, and Dusit Niyato

arXiv:2411.08672·cs.NI·November 14, 2024

Open Access

TL;DR

This paper proposes a reinforcement learning approach to optimize model caching and resource allocation in wireless edge networks for generative AI, balancing quality and latency amid dynamic conditions.

Contribution

It introduces a DDPG-based method for joint model caching and resource allocation, addressing challenges of deploying large GenAI models at the edge.

Findings

01 Higher model hit ratio with DDPG

02 Lower latency in AIGC services

03 Superior performance over benchmark solutions

Abstract

With the rapid advancement of artificial intelligence (AI), generative AI (GenAI) has emerged as a transformative tool, enabling customized and personalized AI-generated content (AIGC) services. However, GenAI models with billions of parameters require substantial memory capacity and computational power for deployment and execution, presenting significant challenges to resource-limited edge networks. In this paper, we address the joint model caching and resource allocation problem in GenAI-enabled wireless edge networks. Our objective is to balance the trade-off between delivering high-quality AIGC and minimizing the delay in AIGC service provisioning. To tackle this problem, we employ a deep deterministic policy gradient (DDPG)-based reinforcement learning approach, capable of efficiently determining optimal model caching and resource allocation decisions for AIGC services in response…
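
The quality-versus-delay trade-off described in the abstract, and the model hit ratio listed under Findings, can be illustrated with a small Python sketch. It is not the paper's formulation: the `ServiceOutcome` fields, the `quality_weight` and `max_delay_s` parameters, and the linear weighted-sum reward are all assumptions chosen for illustration. In a DDPG setup, a scalar of this kind would typically serve as the per-step reward that the agent learns to maximize.

```python
from dataclasses import dataclass


@dataclass
class ServiceOutcome:
    """Outcome of one AIGC request (all fields are hypothetical)."""
    model_cached: bool  # was the requested GenAI model cached at the edge?
    quality: float      # content quality score, assumed normalized to [0, 1]
    delay_s: float      # end-to-end service delay in seconds


def model_hit_ratio(outcomes):
    """Fraction of requests whose model was already cached at the edge."""
    if not outcomes:
        return 0.0
    return sum(o.model_cached for o in outcomes) / len(outcomes)


def tradeoff_reward(outcome, quality_weight=0.5, max_delay_s=10.0):
    """Assumed scalar reward: reward quality, penalize normalized delay."""
    # Clip the delay penalty to [0, 1] so both terms share one scale.
    delay_penalty = min(outcome.delay_s / max_delay_s, 1.0)
    return quality_weight * outcome.quality - (1.0 - quality_weight) * delay_penalty
```

Raising `quality_weight` toward 1 favors content quality, while lowering it toward 0 favors low delay; the paper's actual objective and weighting may differ.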

Taxonomy

Topics: Caching and Content Delivery · Opportunistic and Delay-Tolerant Networks · Cooperative Communication and Network Coding