A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo, Feng Cheng, Zhixu Du, James Kiessling, Jonathan Ku, Shiyu Li, Ziru Li, Mingyuan Ma, Tergel Molom-Ochir, Benjamin Morris, Haoxuan Shan, Jingwei Sun, Yitu Wang, Chiyue Wei, Xueying Wu, Yuhao Wu, Hao Frank Yang, Jingyang Zhang, Junyao Zhang, Qilin Zheng, Guanglei Zhou

TL;DR
This survey reviews the co-design of hardware and software tailored for large language models, highlighting challenges, innovations, and future directions to optimize their development and deployment.
Contribution
The survey provides a comprehensive overview of system-level optimization strategies and co-design approaches specifically for large language models, with the aim of guiding future research.
Findings
Analysis of hardware and software challenges in LLM deployment
Summary of existing co-design approaches for LLMs
Identification of future research directions in LLM system optimization
Abstract
The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence, demonstrating remarkable capabilities in natural language processing and moving towards multi-modal functionality. These models are increasingly integrated into diverse applications, impacting both research and industry. However, their development and deployment present substantial challenges, including the need for extensive computational resources, high energy consumption, and complex software optimizations. Unlike traditional deep learning systems, LLMs require unique optimization strategies for training and inference, focusing on system-level efficiency. This paper surveys hardware and software co-design approaches specifically tailored to address the unique characteristics and constraints of large language models. This survey analyzes the challenges and impacts…
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Distributed and Parallel Computing Systems
