CMDS: Cross-layer Dataflow Optimization for DNN Accelerators Exploiting Multi-bank Memories

Man Shi; Steven Colleman; Charlotte VanDeMieroop; Antony Joseph; Maurice Meijer; Wim Dehaene; Marian Verhelst

arXiv:2406.14574·cs.AR·June 24, 2024

TL;DR

This paper introduces CMDS, a cross-layer dataflow optimization framework for DNN accelerators that leverages multi-bank memories to reduce energy and latency by considering data layout dependencies across layers.

Contribution

The work presents a novel cross-layer dataflow scheduler that accounts for data layout reshuffling and memory bank exploitation, improving efficiency over layer-optimized approaches.

Findings

01. Up to 5.5X energy reduction compared to state-of-the-art.

02. Up to 1.35X latency reduction achieved.

03. Negligible hardware cost for the proposed optimizations.

Abstract

Deep neural networks (DNNs) use a wide range of network topologies to achieve high accuracy across diverse applications. This model diversity makes it impossible to identify a single "dataflow" (execution schedule) that performs optimally across all possible layers and network topologies. Several frameworks support the exploration of the best dataflow for a given DNN layer and hardware. However, switching the dataflow from one layer to the next within one DNN model can result in hardware inefficiencies stemming from memory data layout mismatch among the layers. Unfortunately, all existing frameworks treat each layer independently and typically model memories as black boxes (one large monolithic wide memory), which ignores the data layout and cannot deal with the data layout dependencies of sequential layers. These frameworks are not capable of cross-layer dataflow optimization.…
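To make the layout-mismatch argument concrete, the following is a minimal toy sketch, not the CMDS scheduler itself. It assumes two hypothetical dataflows ("A" and "B"), made-up per-layer energy costs, and a fixed, made-up reshuffle penalty whenever consecutive layers use different dataflows (and therefore different data layouts). It compares picking the best dataflow per layer in isolation against a simple dynamic program that also accounts for the penalty between layers.

```python
# Toy comparison of per-layer vs. cross-layer dataflow selection.
# All names and numbers are illustrative assumptions, not values from the paper.

# Hypothetical energy cost of each layer under two candidate dataflows.
LAYER_COST = [
    {"A": 10, "B": 14},
    {"A": 15, "B": 9},
    {"A": 8, "B": 12},
]

# Assumed penalty for reshuffling data when consecutive layers
# use different dataflows (i.e. mismatched memory layouts).
RESHUFFLE_COST = 7


def total_cost(choice, layer_cost, reshuffle_cost):
    """Sum per-layer costs plus one reshuffle penalty per dataflow switch."""
    compute = sum(layer_cost[i][d] for i, d in enumerate(choice))
    switches = sum(1 for a, b in zip(choice, choice[1:]) if a != b)
    return compute + switches * reshuffle_cost


def per_layer_greedy(layer_cost, reshuffle_cost):
    """Pick the cheapest dataflow for each layer independently."""
    choice = [min(costs, key=costs.get) for costs in layer_cost]
    return choice, total_cost(choice, layer_cost, reshuffle_cost)


def cross_layer_optimal(layer_cost, reshuffle_cost):
    """Dynamic program over layers that includes the reshuffle penalty."""
    # best[d] = (cost, path) of the cheapest schedule ending in dataflow d.
    best = {d: (c, [d]) for d, c in layer_cost[0].items()}
    for costs in layer_cost[1:]:
        new_best = {}
        for d, c in costs.items():
            prev_cost, prev_path = min(
                (
                    (pc + (0 if pd == d else reshuffle_cost), pp)
                    for pd, (pc, pp) in best.items()
                ),
                key=lambda t: t[0],
            )
            new_best[d] = (prev_cost + c, prev_path + [d])
        best = new_best
    cost, path = min(best.values(), key=lambda t: t[0])
    return path, cost


if __name__ == "__main__":
    print("per-layer  :", per_layer_greedy(LAYER_COST, RESHUFFLE_COST))
    print("cross-layer:", cross_layer_optimal(LAYER_COST, RESHUFFLE_COST))
```

With these assumed numbers, the per-layer choice switches dataflow twice and pays two reshuffle penalties, while the cross-layer choice keeps a single dataflow and ends up cheaper overall. A real scheduler would also need to model memory banks and the actual layout of each tensor, which this sketch omits.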
