ELK: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques
Yiqi Liu, Yuqi Xue, Noelle Crawford, Jilong Xue, Jian Huang

TL;DR
This paper introduces Elk, a deep learning compiler framework that optimizes the performance of inter-core connected AI chips by balancing compute, communication, and I/O, leading to near-ideal efficiency.
Contribution
Elk is the first compiler framework to systematically explore and optimize the trade-offs among compute, communication, and I/O for ICCA chips, enhancing their efficiency.
Findings
Achieves 94% of ideal roofline performance on ICCA chips.
Enables effective architecture design space exploration.
Supports large deep learning models efficiently.
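As a rough illustration of what a "fraction of ideal roofline" figure means, the sketch below assumes the ideal time is a lower bound in which compute, inter-core communication, and off-chip I/O overlap perfectly, so the slowest of the three sets the bound. This is a simplified model for intuition, not the paper's actual cost model; the function names, parameters, and numbers are all hypothetical.

```python
def roofline_time(flops, comm_bytes, io_bytes, peak_flops, link_bw, mem_bw):
    """Idealized lower bound on runtime (seconds), assuming perfect overlap.

    Assumption: compute, inter-core communication, and off-chip I/O can run
    fully in parallel, so the slowest of the three determines the bound.
    """
    compute_t = flops / peak_flops   # per-core execution
    comm_t = comm_bytes / link_bw    # inter-core data exchange
    io_t = io_bytes / mem_bw         # off-chip (e.g., HBM) access
    return max(compute_t, comm_t, io_t)


def roofline_fraction(measured_time, ideal_time):
    """Share of the ideal roofline performance actually achieved (0 to 1)."""
    return ideal_time / measured_time


# Hypothetical workload and chip parameters, chosen only for illustration.
ideal = roofline_time(
    flops=2e12, comm_bytes=4e9, io_bytes=6e9,
    peak_flops=1e15, link_bw=8e12, mem_bw=2e12,
)
fraction = roofline_fraction(measured_time=0.004, ideal_time=ideal)
print(f"ideal = {ideal:.4f} s, achieved fraction = {fraction:.2f}")
```

In this made-up case the I/O term dominates, so shaving compute time alone would not raise the achieved fraction. That is the kind of three-way tension the summary describes.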
Abstract
To meet the increasing demand of deep learning (DL) models, AI chips are employing both off-chip memory (e.g., HBM) and high-bandwidth low-latency interconnect for direct inter-core data exchange. However, it is not easy to explore the efficiency of these inter-core connected AI (ICCA) chips, due to a fundamental tussle among compute (per-core execution), communication (inter-core data exchange), and I/O (off-chip data access). In this paper, we develop Elk, a DL compiler framework to maximize the efficiency of ICCA chips by jointly trading off all three of these performance factors. Elk structures these performance factors into configurable parameters and forms a global trade-off space in the DL compiler. To systematically explore this space and maximize overall efficiency, Elk employs a new inductive operator scheduling policy and a cost-aware on-chip memory allocation…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Memory and Neural Computing
