Efficient Multi-user Offloading of Personalized Diffusion Models: A DRL-Convex Hybrid Solution

Wanting Yang; Zehui Xiong; Song Guo; Shiwen Mao; Dong In Kim; Merouane Debbah

arXiv:2411.15781 · cs.NI · March 4, 2025

TL;DR

This paper introduces a framework that combines deep reinforcement learning (DRL) with convex optimization to offload personalized diffusion model inference for multiple users, balancing latency and accuracy across devices with diverse computational resources.

Contribution

It proposes a multi-user hybrid inference method and formulates the offloading problem as an extension of the generalized quadratic assignment problem (GQAP), which it solves with a DRL-convex hybrid approach.

Findings

01 Outperforms traditional methods in optimality and complexity

02 Effectively balances latency and accuracy for multiple users

03 Reduces storage and computational burden on edge servers

Abstract

With the impressive generative capabilities of diffusion models, personalized content synthesis has emerged as one of the most highly anticipated applications. However, the large model sizes and the iterative nature of inference make it difficult to deploy personalized diffusion models broadly on local devices with varying computational power. To this end, we propose a novel framework for efficient multi-user offloading of personalized diffusion models, given a variable number of users, diverse user computational capabilities, and fluctuating available computational resources on the edge server. To enhance computational efficiency and reduce the storage burden on edge servers, we first propose a tailored multi-user hybrid inference scheme, in which the inference process for each user is split into two phases with an optimizable split point. The initial phase of inference is processed on a cluster-wide model using…
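
The two-phase split described in the abstract can be illustrated with a toy latency model. This is only a minimal sketch, not the paper's formulation: it assumes the first phase runs on a shared model at the edge server and the remaining steps run on the user's device (the abstract is truncated at that point), it assumes a fixed per-step time on each side, and it uses a hypothetical cap `max_shared_steps` as a crude stand-in for the accuracy requirement. All names below are illustrative.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical parameter: seconds per denoising step on the user's device.
    local_step_time: float


def total_latency(split: int, total_steps: int,
                  edge_step_time: float, user: User) -> float:
    """Latency when the first `split` steps run at the edge and the
    remaining steps run locally (toy model, no transmission delay)."""
    if not 0 <= split <= total_steps:
        raise ValueError("split must lie in [0, total_steps]")
    return split * edge_step_time + (total_steps - split) * user.local_step_time


def best_split(total_steps: int, edge_step_time: float,
               user: User, max_shared_steps: int) -> int:
    """Exhaustively pick the split point with the lowest toy latency.

    `max_shared_steps` is an assumed cap on how many steps may use the
    shared model, standing in for an accuracy constraint.
    """
    upper = min(max_shared_steps, total_steps)
    candidates = range(0, upper + 1)
    return min(candidates,
               key=lambda s: total_latency(s, total_steps, edge_step_time, user))


if __name__ == "__main__":
    slow_device = User(local_step_time=0.5)
    split = best_split(total_steps=50, edge_step_time=0.1,
                       user=slow_device, max_shared_steps=30)
    print(split, total_latency(split, 50, 0.1, slow_device))
```

In this toy model a slow device pushes the split point up to the cap, while a device faster than the edge keeps all steps local. The paper's actual problem couples the users through shared edge resources, which this per-user search ignores.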

Taxonomy

Topics: Peer-to-Peer Network Technologies · Caching and Content Delivery