Towards Fast Setup and High Throughput of GPU Serverless Computing
Han Zhao, Weihao Cui, Quan Chen, Shulai Zhang, Zijun Li, Jingwen Leng, Chao Li, Deze Zeng, Minyi Guo

TL;DR
This paper introduces SAGE, a GPU serverless framework that significantly reduces setup time and increases throughput by parallelizing data preparation and sharing memory across function invocations.
Contribution
SAGE presents novel parallelized setup and memory sharing mechanisms to enhance GPU serverless computing efficiency and throughput.
Findings
Reduces function duration by 11.3 times
Improves function density by 1.22 times
Outperforms state-of-the-art platforms in experiments
Abstract
Integrating GPUs into serverless computing platforms is crucial for improving efficiency. However, existing GPU-enabled serverless computing platforms face two significant problems caused by coarse-grained GPU management: long setup time and low function throughput. To address these issues, we propose SAGE, a GPU serverless framework with fast setup and high throughput. First, because the data a GPU function needs is knowable ahead of its actual execution, SAGE devises a parallelized function setup mechanism that runs data preparation and context creation in parallel. In this way, SAGE achieves fast setup of GPU function invocations. Second, SAGE proposes a sharing-based memory management mechanism that shares read-only memory and context memory across multiple invocations of the same function. The memory sharing mechanism avoids repeated data preparation…
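The two mechanisms named in the abstract can be illustrated with a minimal CPU-only sketch. This is not SAGE's implementation: create_context, prepare_data, SharedReadOnlyCache, and the function name "fn_a" are hypothetical stand-ins, and the sleeps only simulate the cost of GPU context creation and data loading. The sketch assumes the two setup steps are independent, which is what allows them to overlap, and that read-only data can be loaded once and reference-counted across invocations of the same function.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def create_context():
    """Hypothetical stand-in for GPU context creation."""
    time.sleep(0.2)
    return "ctx"


def prepare_data(name):
    """Hypothetical stand-in for loading a function's read-only data."""
    time.sleep(0.2)
    return f"data:{name}"


def serial_setup(name):
    # Baseline: context creation and data preparation run back to back.
    ctx = create_context()
    data = prepare_data(name)
    return ctx, data


def parallel_setup(name):
    # Both steps are submitted up front, so their latencies overlap.
    with ThreadPoolExecutor(max_workers=2) as pool:
        ctx_future = pool.submit(create_context)
        data_future = pool.submit(prepare_data, name)
        return ctx_future.result(), data_future.result()


class SharedReadOnlyCache:
    """Reference-counted cache: one copy of read-only data per function."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # function name -> [payload, refcount]
        self.loads = 0      # how many times data was actually prepared

    def acquire(self, name):
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                # First invocation pays the preparation cost.
                entry = [prepare_data(name), 0]
                self._entries[name] = entry
                self.loads += 1
            entry[1] += 1
            return entry[0]

    def release(self, name):
        with self._lock:
            entry = self._entries[name]
            entry[1] -= 1
            if entry[1] == 0:
                # Last user is gone; the shared copy can be freed.
                del self._entries[name]


if __name__ == "__main__":
    start = time.perf_counter()
    serial_setup("fn_a")
    print(f"serial setup:   {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    parallel_setup("fn_a")
    print(f"parallel setup: {time.perf_counter() - start:.2f}s")

    cache = SharedReadOnlyCache()
    cache.acquire("fn_a")
    cache.acquire("fn_a")
    print(f"data loads for two invocations: {cache.loads}")
```

In this toy model the parallel path costs roughly the longer of the two steps rather than their sum, and a second invocation of the same function reuses the already loaded data instead of preparing it again.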
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques
