Large Language Models (LLMs): Deployment, Tokenomics and Sustainability

Haiwei Dong; Shuang Xie

arXiv:2405.17147 · cs.MM · May 28, 2024 · 5 cites

Open Access

TL;DR

This paper provides a comprehensive overview of the deployment strategies, economic factors, and sustainability challenges of large language models, emphasizing their operational considerations and future environmental impacts.

Contribution

It offers a detailed analysis of deployment methods, tokenomics, and sustainability issues, including quantitative assessments and future architecture visions for LLMs.

Findings

01. RAG and fine-tuning each have distinct advantages and limitations.

02. The paper quantitatively analyzes the xPU requirements for training and inference.

03. The paper discusses the environmental carbon footprint of LLM deployment.

Abstract

The rapid advancement of Large Language Models (LLMs) has significantly impacted human-computer interaction, epitomized by the release of GPT-4o, which introduced comprehensive multi-modality capabilities. In this paper, we first explore the deployment strategies, economic considerations, and sustainability challenges associated with state-of-the-art LLMs. More specifically, we discuss the deployment debate between Retrieval-Augmented Generation (RAG) and fine-tuning, highlighting their respective advantages and limitations. We then quantitatively analyze the xPU requirements for training and inference. Additionally, for the tokenomics of LLM services, we examine the balance between performance and cost from the perspective of end users' quality of experience (QoE). Lastly, we envision the future hybrid architecture of LLM processing and its corresponding…
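
To give a feel for what a quantitative xPU estimate can look like, here is a minimal Python sketch. It is not the paper's method: it assumes the widely used rule of thumb that training a dense transformer costs roughly 6 × parameters × tokens floating-point operations, and the function names, the 30-day budget, the 3.0e14 FLOP/s peak figure, and the 40% utilization are all hypothetical values chosen for illustration.

```python
import math


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute (FLOPs).

    Assumption: the common 6 * N * D rule of thumb for dense transformers,
    not a formula taken from the paper.
    """
    return 6.0 * n_params * n_tokens


def accelerators_needed(
    n_params: float,
    n_tokens: float,
    peak_flops_per_device: float,
    utilization: float,
    days: float,
) -> int:
    """Number of devices needed to finish training within `days`.

    `utilization` is the fraction of peak throughput actually sustained
    (a hypothetical input; real values depend on hardware and software).
    """
    seconds = days * 24 * 3600
    sustained_flops = peak_flops_per_device * utilization
    total = training_flops(n_params, n_tokens)
    return math.ceil(total / (sustained_flops * seconds))


if __name__ == "__main__":
    # Hypothetical scenario: 7e9 parameters, 1e12 training tokens,
    # devices with 3.0e14 FLOP/s peak at 40% utilization, 30-day budget.
    print(accelerators_needed(7e9, 1e12, 3.0e14, 0.4, 30))
```

Under these assumptions the estimate scales linearly with model size and token count and inversely with the time budget, so halving the allowed days roughly doubles the device count.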

Taxonomy

Topics: Topic Modeling · Advanced Data Processing Techniques