Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services
Shuangwei Gao, Peng Yang, Yuxin Kong, Feng Lyu, and Ning Zhang

TL;DR
This paper proposes an edge-enabled system for assigning generative AI models and resources to mobile users, improving content quality and reducing latency through probabilistic model assignment and adaptive resource allocation.
Contribution
It introduces a novel probabilistic model assignment approach and a heuristic algorithm for dynamic resource allocation in edge-based AIGC services.
Findings
Content quality improved by up to 4.7%.
Response delay reduced by up to 39.1%.
Effective system for mobile AIGC services.
Abstract
Artificial Intelligence Generated Content (AIGC) services can efficiently satisfy user-specified content creation demands, but their high computational requirements pose various challenges to supporting mobile users at scale. In this paper, we present our design of an edge-enabled AIGC service provisioning system to properly assign computing tasks of generative models to edge servers, thereby improving overall user experience and reducing content generation latency. Specifically, once the edge server receives user-requested task prompts, it dynamically assigns appropriate models and allocates computing resources based on the features of each category of prompts. The generated content is then delivered to users. The key to this system is a proposed probabilistic model assignment approach, which estimates the quality score of generated content for each prompt based on category labels. Next,…
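To make the idea of category-based model assignment concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each generative model has a hypothetical table of expected quality scores per prompt category and a hypothetical per-request compute cost, and it uses a simple greedy rule (pick the highest expected quality that still fits the remaining edge capacity) as a stand-in for the paper's probabilistic assignment and heuristic allocation. All names (`ModelProfile`, `assign_models`, the category labels, and the numbers) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModelProfile:
    """Hypothetical description of one generative model on the edge server."""
    name: str
    # Assumed: expected quality score for each prompt category label.
    quality_by_category: Dict[str, float]
    # Assumed: compute units consumed by serving one request.
    compute_cost: float


def assign_models(
    prompt_categories: List[str],
    models: List[ModelProfile],
    capacity: float,
) -> List[Optional[str]]:
    """Greedily assign a model to each prompt, in arrival order.

    For every prompt, choose the model with the highest expected quality
    for that prompt's category among the models that still fit in the
    remaining capacity. Returns None for prompts that cannot be served.
    """
    remaining = capacity
    assignment: List[Optional[str]] = []
    for category in prompt_categories:
        feasible = [m for m in models if m.compute_cost <= remaining]
        if not feasible:
            assignment.append(None)
            continue
        best = max(
            feasible,
            key=lambda m: m.quality_by_category.get(category, 0.0),
        )
        assignment.append(best.name)
        remaining -= best.compute_cost
    return assignment


# Illustrative data only; these scores and costs are made up.
models = [
    ModelProfile("small", {"portrait": 0.6, "landscape": 0.7}, 1.0),
    ModelProfile("large", {"portrait": 0.9, "landscape": 0.75}, 3.0),
]
result = assign_models(["portrait", "landscape", "portrait"], models, capacity=5.0)
print(result)  # ['large', 'small', 'small']
```

In this toy run the first prompt gets the larger model, after which only the smaller model fits the remaining capacity. A real system would also have to account for delay and for uncertainty in the quality estimates, which this sketch ignores.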
Taxonomy
Topics: Transportation and Mobility Innovations · Service-Oriented Architecture and Web Services · Sharing Economy and Platforms
