SoMa: Identifying, Exploring, and Understanding the DRAM Communication Scheduling Space for DNN Accelerators
Jingwei Cai (1), Xuan Wang (2, 3), Mingyu Gao (1, 4, 5), Sen Peng (2, 3), Zijian Zhu (1), Yuchen Wei (1), Zuotong Wu (2, 3), and Kaisheng Ma (1) ((1) Tsinghua University, (2) Xi'an Jiaotong University, (3) IIISCT, (4) Shanghai AI Laboratory, (5) Shanghai Qi Zhi Institute)

TL;DR
This paper introduces SoMa, a comprehensive framework for optimizing DRAM communication scheduling in DNN accelerators by fusing multiple layers and exploring the scheduling space, leading to significant performance and energy improvements.
Contribution
We develop a tensor-centric notation and an end-to-end scheduling framework, SoMa, to optimize DRAM communication by fusing layers and exploring scheduling schemes, surpassing existing methods.
Findings
SoMa achieves a 2.11x performance improvement over the state of the art (SOTA).
SoMa reduces energy cost by 37.3%.
Effective layer fusion and scheduling exploration enhance DNN accelerator efficiency.
Abstract
Modern Deep Neural Network (DNN) accelerators are equipped with increasingly larger on-chip buffers to provide more opportunities to alleviate the increasingly severe DRAM bandwidth pressure. However, most existing research on buffer utilization still primarily focuses on single-layer dataflow scheduling optimization. As buffers grow large enough to accommodate most single-layer weights in most networks, the impact of single-layer dataflow optimization on DRAM communication diminishes significantly. Therefore, developing new paradigms that fuse multiple layers to fully leverage the increasingly abundant on-chip buffer resources to reduce DRAM accesses has become particularly important, yet remains an open challenge. To address this challenge, we first identify the optimization opportunities in DRAM communication scheduling by analyzing the drawbacks of existing works on the layer fusion…
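The abstract's core argument, that fusing layers cuts DRAM accesses once buffers are large, can be illustrated with a toy traffic count. The sketch below is not SoMa's notation or cost model. The `Layer` class, both function names, and all byte counts are hypothetical. The fused case assumes an idealized chain whose intermediate feature maps all fit in the on-chip buffer, and it ignores tiling, halo overlap, and recomputation.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical layer description used only for this illustration."""
    name: str
    weight_bytes: int
    out_bytes: int  # size of the layer's output feature map


def dram_traffic_unfused(input_bytes, layers):
    """Layer-by-layer execution: every layer reads its input and weights
    from DRAM and writes its output back to DRAM."""
    total = 0
    in_bytes = input_bytes
    for layer in layers:
        total += in_bytes + layer.weight_bytes + layer.out_bytes
        in_bytes = layer.out_bytes  # next layer re-reads this from DRAM
    return total


def dram_traffic_fused(input_bytes, layers):
    """Idealized fusion of the whole chain (assumption: all intermediates
    stay on chip): only the first input, the weights, and the final
    output cross the DRAM boundary."""
    weights = sum(layer.weight_bytes for layer in layers)
    return input_bytes + weights + layers[-1].out_bytes


# Made-up sizes, chosen only to make the arithmetic easy to follow.
chain = [Layer("conv1", 100, 400), Layer("conv2", 200, 400), Layer("conv3", 300, 100)]
unfused = dram_traffic_unfused(300, chain)
fused = dram_traffic_fused(300, chain)
print(unfused, fused, unfused - fused)
```

In this toy model the saving equals one write plus one read of each intermediate feature map, which is why the benefit grows as buffers become large enough to hold those intermediates.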
Taxonomy
Topics: Radiation Effects in Electronics · Energy Harvesting in Wireless Networks · Particle Detector Development and Performance
