Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native
Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo

TL;DR
This paper explores the integration of large generative AI models with cloud-native architectures, proposing an AI-native paradigm that combines cloud-native technologies with advanced ML runtimes to reduce costs and improve resource accessibility.
Contribution
It introduces an AI-native computing paradigm that combines cloud-native technologies with ML runtimes to address the escalating costs and GPU demand of large models.
Findings
Proposes an AI-native paradigm for large models
Highlights benefits of cloud-native integration
Encourages future research in AI-native computing
Abstract
In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures. Recent large models such as ChatGPT, while revolutionary in their capabilities, face challenges like escalating costs and demand for high-end GPUs. Drawing analogies between large-model-as-a-service (LMaaS) and cloud database-as-a-service (DBaaS), we describe an AI-native computing paradigm that harnesses the power of both cloud-native technologies (e.g., multi-tenancy and serverless computing) and advanced machine learning runtimes (e.g., batched LoRA inference). These joint efforts aim to optimize cost of goods sold (COGS) and improve resource accessibility. The journey of merging these two domains has only just begun, and we hope to stimulate future research and development in this area.
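The abstract names batched LoRA inference as an example of an advanced ML runtime suited to multi-tenancy. The following is a minimal NumPy sketch of that general idea, not the paper's implementation: requests from different tenants share one base-weight matrix multiplication, and each request adds its own low-rank adapter update. The function name `batched_lora_forward`, the tensor shapes, and the `scale` parameter are all assumptions made for illustration.

```python
import numpy as np

def batched_lora_forward(x, W, A, B, adapter_ids, scale=1.0):
    """Hypothetical helper: apply one shared base layer plus per-request LoRA adapters.

    x:           (batch, d_in)          one input row per request
    W:           (d_in, d_out)          base weights shared by all tenants
    A:           (n_adapters, d_in, r)  low-rank "down" matrices, one per adapter
    B:           (n_adapters, r, d_out) low-rank "up" matrices, one per adapter
    adapter_ids: (batch,)               which adapter each request uses
    """
    # One matmul over the whole batch for the shared base model.
    base = x @ W
    # Gather each request's adapter matrices.
    A_sel = A[adapter_ids]  # (batch, d_in, r)
    B_sel = B[adapter_ids]  # (batch, r, d_out)
    # Per-request low-rank update: (x_b @ A_b) @ B_b, computed for all b at once.
    low = np.einsum("bi,bir->br", x, A_sel)
    delta = np.einsum("br,bro->bo", low, B_sel)
    return base + scale * delta

# Illustrative sizes only; none of these numbers come from the paper.
rng = np.random.default_rng(0)
d_in, d_out, r, n_adapters, batch = 8, 6, 2, 3, 5
W = rng.normal(size=(d_in, d_out))
A = rng.normal(size=(n_adapters, d_in, r))
B = rng.normal(size=(n_adapters, r, d_out))
x = rng.normal(size=(batch, d_in))
adapter_ids = np.array([0, 2, 1, 0, 2])  # five requests across three tenants

y = batched_lora_forward(x, W, A, B, adapter_ids)
```

The cost-sharing argument in this sketch is that the large `x @ W` product is computed once for every tenant in the batch, while the per-tenant work is limited to the small rank-`r` matrices.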
Taxonomy
Topics: IoT and Edge/Fog Computing · Scientific Computing and Data Management · Cloud Computing and Resource Management
