Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
Zuoning Zhang, Dhruv Parikh, Youning Zhang, Viktor Prasanna

TL;DR
This paper evaluates how well the Cerebras Wafer Scale Engine's hardware accelerates the training and inference of large language models, analyzing scalability, memory bandwidth, and performance through benchmarking and roofline modeling.
Contribution
It provides the first comprehensive benchmarking of LLMs on the Cerebras WSE, demonstrating its potential to handle memory-bound and compute-intensive NLP and CV tasks.
Findings
Cerebras WSE significantly accelerates LLM training and inference.
The system effectively mitigates the memory wall with high-bandwidth on-chip memory.
Performance scales well with model size and computational intensity.
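
As a rough illustration of the roofline modeling mentioned in the TL;DR, the sketch below computes the standard roofline bound: attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity. The function names and the machine numbers are hypothetical placeholders chosen for readability. They are not Cerebras WSE specifications and not results from the paper.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float,
                     arithmetic_intensity: float) -> float:
    """Roofline bound in FLOP/s.

    peak_flops: peak compute throughput (FLOP/s)
    mem_bandwidth: memory bandwidth (bytes/s)
    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def ridge_point(peak_flops: float, mem_bandwidth: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_flops / mem_bandwidth


# Hypothetical machine parameters (assumptions for illustration only).
PEAK = 100e12  # FLOP/s
BW = 1e12      # bytes/s

if __name__ == "__main__":
    ridge = ridge_point(PEAK, BW)
    for ai in (1, 10, 100, 1000):
        perf = attainable_flops(PEAK, BW, ai)
        regime = "memory-bound" if ai < ridge else "compute-bound"
        print(f"intensity={ai:>5} FLOP/byte  bound={perf:.2e} FLOP/s  ({regime})")
```

Kernels whose arithmetic intensity falls below the ridge point are limited by memory bandwidth, which is why higher memory bandwidth moves the ridge point left and lets more workloads reach peak compute.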
Abstract
Transformer-based Large Language Models (LLMs) have recently reached state-of-the-art performance in Natural Language Processing (NLP) and Computer Vision (CV) domains. LLMs use the Multi-Headed Self-Attention (MHSA) mechanism to capture long-range global attention relationships among input words or image patches, drastically improving their performance over prior deep learning approaches. In this paper, we evaluate the performance of LLMs on the Cerebras Wafer Scale Engine (WSE). Cerebras WSE is a high-performance computing system with 2.6 trillion transistors, 850,000 cores and 40 GB of on-chip memory. Cerebras WSE's Sparse Linear Algebra Compute (SLAC) cores eliminate multiply-by-zero operations, and its 40 GB of on-chip memory is uniformly distributed among SLAC cores, enabling fast local access to model parameters. Moreover, Cerebras software configures routing between cores at…
Taxonomy
Topics: Natural Language Processing Techniques
