Efficient Multi-user Offloading of Personalized Diffusion Models: A DRL-Convex Hybrid Solution
Wanting Yang, Zehui Xiong, Song Guo, Shiwen Mao, Dong In Kim, Merouane Debbah

TL;DR
This paper introduces a hybrid DRL-convex optimization framework for efficient multi-user offloading of personalized diffusion models, balancing latency and accuracy on edge devices with diverse resources.
Contribution
It proposes a novel multi-user hybrid inference method and formulates the offloading problem as a GQAP extension, solved via a DRL-convex hybrid approach.
Findings
Outperforms traditional methods in optimality and complexity
Effectively balances latency and accuracy for multiple users
Reduces storage and computational burden on edge servers
Abstract
With the impressive generative capabilities of diffusion models, personalized content synthesis has emerged as one of the most highly anticipated applications. However, the large model sizes and the iterative nature of inference make it difficult to deploy personalized diffusion models broadly on local devices with varying computational power. To this end, we propose a novel framework for efficient multi-user offloading of personalized diffusion models, given a variable number of users, diverse user computational capabilities, and fluctuating available computational resources on the edge server. To enhance computational efficiency and reduce the storage burden on edge servers, we first propose a tailored multi-user hybrid inference scheme, where the inference process for each user is split into two phases with an optimizable split point. The initial phase of inference is processed on a cluster-wide model using…
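To make the role of the split point concrete, here is a minimal Python sketch of a two-phase latency model. It is not the paper's DRL-convex method: it uses a brute-force search over one user's split point, and it assumes the second phase runs on the user's own device, which the truncated abstract does not state. All names (`User`, `split_latency`, `best_split`), the per-step timings, and the `min_personal_steps` constraint are hypothetical and stand in for the accuracy requirement.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    device_step_time: float  # hypothetical: seconds per denoising step on the user's device


def split_latency(total_steps: int, split: int, edge_step_time: float, user: User) -> float:
    """Latency if the first `split` steps run on the shared cluster-wide model
    and the remaining steps run on the user's device (an assumption)."""
    if not 0 <= split <= total_steps:
        raise ValueError("split must lie in [0, total_steps]")
    shared_phase = split * edge_step_time
    personal_phase = (total_steps - split) * user.device_step_time
    return shared_phase + personal_phase


def best_split(total_steps: int, edge_step_time: float, user: User,
               min_personal_steps: int) -> int:
    """Brute-force the lowest-latency split that still leaves at least
    `min_personal_steps` steps for the personalized phase (an accuracy proxy)."""
    candidates = range(0, total_steps - min_personal_steps + 1)
    return min(candidates,
               key=lambda s: split_latency(total_steps, s, edge_step_time, user))


slow_device = User("slow", device_step_time=0.5)
fast_device = User("fast", device_step_time=0.125)

# A slow device pushes as many steps as allowed onto the edge server;
# a device faster than the edge server keeps every step local.
slow_split = best_split(50, 0.25, slow_device, min_personal_steps=10)
fast_split = best_split(50, 0.25, fast_device, min_personal_steps=10)
```

In this toy model latency is linear in the split point, so the best split always sits at one end of the allowed range. The trade-off the paper optimizes is richer: users share edge resources that fluctuate, and the split point also affects accuracy, which is why a joint multi-user formulation is needed.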
Taxonomy
Topics: Peer-to-Peer Network Technologies · Caching and Content Delivery
