Theseus: Exploring Efficient Wafer-Scale Chip Design for Large Language Models
Jingchen Zhu, Chenhao Xue, Yiqi Chen, Zhao Wang, Chen Zhang, Yu Shen, Yifan Chen, Zekang Cheng, Yu Jiang, Tianqi Wang, Yibo Lin, Wei Hu, Bin Cui, Runsheng Wang, Yun Liang, Guangyu Sun

TL;DR
This paper introduces Theseus, a framework for efficiently exploring wafer-scale chip designs tailored for large language models, significantly improving performance and power efficiency over existing solutions.
Contribution
The paper presents a novel exploration framework that combines a comprehensive WSC design space with multi-fidelity Bayesian optimization for LLMs, enabling efficient identification of optimal designs.
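To illustrate the general idea behind multi-fidelity exploration, the following is a minimal sketch, not the paper's method. It uses a plain two-stage screening with no surrogate model or acquisition function, so it is a simplified stand-in for Bayesian optimization. The design parameters (`cores`, `sram`), both evaluator functions, and the scoring formula are hypothetical and chosen only for illustration.

```python
import random

def high_fidelity(design):
    """Stand-in for a slow, accurate evaluator (toy formula, not from the paper)."""
    cores, sram = design
    # Toy score that peaks at cores=24, sram=48; higher is better.
    return -((cores - 24) ** 2) - ((sram - 48) ** 2) / 4.0

def low_fidelity(design, rng):
    """Stand-in for a fast, approximate evaluator: the toy score plus noise."""
    return high_fidelity(design) + rng.gauss(0.0, 20.0)

def multi_fidelity_search(candidates, budget_high, seed=0):
    """Screen all candidates cheaply, then evaluate only a shortlist accurately."""
    rng = random.Random(seed)
    # Stage 1: rank every candidate with the cheap, noisy evaluator.
    ranked = sorted(candidates, key=lambda d: low_fidelity(d, rng), reverse=True)
    # Stage 2: spend the expensive budget only on the most promising designs.
    shortlist = ranked[:budget_high]
    best = max(shortlist, key=high_fidelity)
    return best, len(shortlist)

# Hypothetical design grid: (cores per die, SRAM in MB per core).
candidates = [(c, s) for c in range(8, 41, 4) for s in range(16, 81, 8)]
best, n_high = multi_fidelity_search(candidates, budget_high=10)
print(best, n_high)
```

In this sketch only 10 of the 81 candidate designs reach the expensive evaluator. A Bayesian variant would replace the one-shot ranking with a surrogate model that is updated after each evaluation.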
Findings
Achieves up to 62.8% performance improvement over GPU clusters.
Reduces power consumption by up to 42.4% compared to existing WSCs.
Enhances inference performance by up to 23.2 times.
Abstract
The emergence of large language models (LLMs) has driven exponential growth in demand for computation throughput, memory capacity, and communication bandwidth. This growth has significantly outpaced improvements in the corresponding chip designs. With advances in fabrication and integration technologies, designers have been developing Wafer-Scale Chips (WSCs) to scale up and push the limits of computation density, memory capacity, and communication bandwidth at the level of a single chip. Existing solutions have demonstrated significant advantages of WSCs over traditional designs, showing their potential to effectively support LLM workloads. Despite these benefits, exploring the early-stage design space of WSCs for LLMs is a crucial yet challenging task due to the enormous and complicated design space, time-consuming evaluation methods, and inefficient exploration…
Taxonomy
Topics: Embedded Systems Design Techniques · Modular Robots and Swarm Intelligence · VLSI and FPGA Design Techniques
