Two-Timescale Model Caching and Resource Allocation for Edge-Enabled AI-Generated Content Services

Zhang Liu, Hongyang Du, Xiangwang Hou, Lianfen Huang, Seyyedali Hosseinalipour, Dusit Niyato, and Khaled Ben Letaief

arXiv:2411.01458·cs.LG·November 5, 2024

Open Access

TL;DR

This paper proposes a two-timescale deep reinforcement learning framework for efficient model caching and resource allocation in edge-enabled AI-generated content services, balancing quality and latency.

Contribution

It introduces a novel two-timescale DRL approach combining DDQN and diffusion-based D3PG algorithms for joint caching and resource management in edge AI services.
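
As a point of reference for the DDQN component, the snippet below is a minimal sketch of the standard Double DQN bootstrap target, assuming DDQN here denotes a double deep Q-network. The function name and arguments are hypothetical and are not taken from the paper; Q-values are passed in as plain lists rather than produced by neural networks.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target for a single transition (illustrative sketch).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Select the greedy next action using the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... but evaluate that action using the target network.
    return reward + gamma * next_q_target[best_action]
```

Decoupling action selection from action evaluation is what distinguishes Double DQN from plain DQN, which would take the maximum over `next_q_target` directly.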

Findings

1. The proposed T2DRL algorithm outperforms baseline methods in simulations.

2. The diffusion-based D3PG effectively manages continuous resource allocation.

3. Joint optimization improves AIGC service quality and reduces latency.

Abstract

Generative AI (GenAI) has emerged as a transformative technology, enabling customized and personalized AI-generated content (AIGC) services. In this paper, we address challenges of edge-enabled AIGC service provisioning, which remain underexplored in the literature. These services require executing GenAI models with billions of parameters, posing significant obstacles to the resource-limited wireless edge. We subsequently introduce the formulation of joint model caching and resource allocation for AIGC services to balance a trade-off between AIGC quality and latency metrics. We obtain mathematical relationships of these metrics with the computational resources required by GenAI models via experimentation. Afterward, we decompose the formulation into a model caching subproblem on a long timescale and a resource allocation subproblem on a short timescale. Since the variables to be solved are…
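
The long-timescale and short-timescale decomposition described in the abstract can be pictured as a nested loop: a caching decision is made once per frame, and a resource allocation decision is made in every slot within that frame. The following is a minimal sketch under stated assumptions, not the paper's T2DRL algorithm. All names (`run_two_timescale`, `cache_first_two`, `equal_share`), the random request generator, and the four-model catalogue are hypothetical, and the two placeholder policies stand in for the learned agents.

```python
import random

def run_two_timescale(num_frames, slots_per_frame, choose_cache, allocate, seed=0):
    """Nested control loop: caching per frame, allocation per slot (sketch)."""
    rng = random.Random(seed)
    log = []
    for frame in range(num_frames):
        # Long timescale: decide once per frame which models to cache.
        cached = choose_cache(frame)
        for slot in range(slots_per_frame):
            # Hypothetical workload: three requests, each for one of four models.
            requests = [rng.randrange(4) for _ in range(3)]
            # Short timescale: split resources among this slot's requests.
            shares = allocate(cached, requests)
            log.append((frame, slot, cached, shares))
    return log

def cache_first_two(frame):
    """Placeholder caching policy: always cache models 0 and 1."""
    return frozenset({0, 1})

def equal_share(cached, requests):
    """Placeholder allocation: equal share for requests whose model is cached."""
    served = sum(1 for r in requests if r in cached)
    return [1.0 / served if r in cached else 0.0 for r in requests]

if __name__ == "__main__":
    for entry in run_two_timescale(2, 3, cache_first_two, equal_share):
        print(entry)
```

In a learned version, `choose_cache` would be replaced by a discrete-action agent and `allocate` by a continuous-action agent; only the loop structure is meant to carry over.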

Taxonomy

Topics: Caching and Content Delivery · IoT and Edge/Fog Computing · Recommender Systems and Techniques

Methods: Diffusion