CMDS: Cross-layer Dataflow Optimization for DNN Accelerators Exploiting Multi-bank Memories
Man Shi, Steven Colleman, Charlotte VanDeMieroop, Antony Joseph, Maurice Meijer, Wim Dehaene, Marian Verhelst

TL;DR
This paper introduces CMDS, a cross-layer dataflow optimization framework for DNN accelerators that leverages multi-bank memories to reduce energy and latency by considering data layout dependencies across layers.
Contribution
The work presents a novel cross-layer dataflow scheduler that accounts for data layout reshuffling and memory bank exploitation, improving efficiency over layer-optimized approaches.
Findings
Up to 5.5× energy reduction compared with state-of-the-art layer-optimized scheduling.
Up to 1.35× latency reduction.
Negligible hardware cost for the proposed optimizations.
Abstract
Deep neural networks (DNNs) use a wide range of network topologies to achieve high accuracy within diverse applications. This model diversity makes it impossible to identify a single "dataflow" (execution schedule) that performs optimally across all possible layers and network topologies. Several frameworks support the exploration of the best dataflow for a given DNN layer and hardware. However, switching the dataflow from one layer to the next within one DNN model can result in hardware inefficiencies stemming from memory data layout mismatch among the layers. Unfortunately, all existing frameworks treat each layer independently and typically model memories as black boxes (one large monolithic wide memory), which ignores the data layout and cannot deal with the data layout dependencies of sequential layers. These frameworks are not capable of cross-layer dataflow optimization.…
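To make the layout-mismatch argument concrete, here is a minimal sketch contrasting per-layer selection with a cross-layer search. It is not the CMDS scheduler: the function names `greedy_schedule` and `cross_layer_schedule`, the dataflow labels "A" and "B", every cost value, and the single flat reshuffle penalty are hypothetical assumptions chosen only for illustration. The sketch also assumes that each dataflow implies one output layout, so a reshuffle is charged whenever two consecutive layers use different dataflows.

```python
from typing import Dict, List, Tuple

LayerCosts = List[Dict[str, float]]  # per layer: dataflow label -> cost (hypothetical units)


def greedy_schedule(layer_costs: LayerCosts, reshuffle_cost: float) -> Tuple[List[str], float]:
    """Pick the cheapest dataflow for each layer in isolation,
    then pay a reshuffle penalty at every dataflow switch."""
    choice = [min(costs, key=costs.get) for costs in layer_costs]
    total = sum(costs[c] for costs, c in zip(layer_costs, choice))
    total += sum(reshuffle_cost for a, b in zip(choice, choice[1:]) if a != b)
    return choice, total


def cross_layer_schedule(layer_costs: LayerCosts, reshuffle_cost: float) -> Tuple[List[str], float]:
    """Dynamic programming over layers: for each dataflow of the current layer,
    keep the cheapest path ending in it, including reshuffle penalties."""
    best = {df: (cost, [df]) for df, cost in layer_costs[0].items()}
    for costs in layer_costs[1:]:
        new_best = {}
        for df, cost in costs.items():
            candidates = [
                (prev_total + cost + (reshuffle_cost if prev != df else 0.0), path + [df])
                for prev, (prev_total, path) in best.items()
            ]
            new_best[df] = min(candidates, key=lambda c: c[0])
        best = new_best
    total, path = min(best.values(), key=lambda c: c[0])
    return path, total


if __name__ == "__main__":
    # Hypothetical three-layer network with two candidate dataflows per layer.
    costs = [{"A": 10, "B": 12}, {"A": 12, "B": 10}, {"A": 10, "B": 12}]
    print(greedy_schedule(costs, reshuffle_cost=5))       # (['A', 'B', 'A'], 40)
    print(cross_layer_schedule(costs, reshuffle_cost=5))  # (['A', 'A', 'A'], 32.0)
```

With these made-up numbers, the per-layer choice switches dataflow twice and pays two reshuffles, while the cross-layer search accepts a slightly worse dataflow in the middle layer to avoid them. A real scheduler would replace the flat penalty with a cost derived from the actual layouts and memory banks.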
