LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space

Michael Gilbert; Yannan Nellie Wu; Joel S. Emer; Vivienne Sze

arXiv:2409.13625·cs.AR·October 15, 2024

TL;DR

LoopTree systematically explores an expanded design space of fused-layer dataflows for DNN accelerators, exposing more efficient trade-offs among on-chip buffer capacity, energy, and latency.

Contribution

The paper introduces a design space, a taxonomy, and an evaluation model for fused-layer dataflow accelerators, covering more of the space than prior, more limited explorations.

Findings

01. Achieves up to a 10× reduction in buffer capacity for the same off-chip transfers.

02. The model is validated against prior architectures with a worst-case error of 4%.

03. Exploring the larger design space yields more efficient accelerator designs.

Abstract

Latency and energy consumption are key metrics in the performance of deep neural network (DNN) accelerators. A significant factor contributing to latency and energy is data transfers. One method to reduce data transfers is reusing data when multiple operations use the same data. Fused-layer accelerators reuse data across operations in different layers by retaining intermediate data in on-chip buffers, which has been shown to reduce energy consumption and latency. Moreover, the intermediate data is often tiled (i.e., broken into chunks) to reduce the on-chip buffer capacity required to reuse the data. Because on-chip buffer capacity is frequently more limited than computation units, fused-layer dataflow accelerators may also recompute certain parts of the intermediate data instead of retaining them in a buffer. Achieving efficient trade-offs between on-chip buffer capacity, off-chip…
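The tiling and recomputation ideas described in the abstract can be illustrated with a toy cost model. The sketch below is not LoopTree's model and is not taken from the paper. It assumes two fused 1D convolutions with stride 1, no padding, and the same kernel size, and it assumes every tile recomputes the intermediate values it shares with its neighbour. The function name `fused_conv1d_costs` and its parameters are hypothetical.

```python
def fused_conv1d_costs(out_len, kernel, tile):
    """Toy cost model for two fused 1D convolutions (stride 1, no padding).

    The final output is produced tile by tile. Each tile recomputes the
    intermediate values it shares with its neighbour instead of retaining
    them, which is an assumption of this sketch.

    Returns (intermediate_buffer_elems, intermediate_elems_computed).
    """
    if out_len <= 0 or kernel <= 0 or tile <= 0:
        raise ValueError("all arguments must be positive")

    # A tile of t outputs needs t + (kernel - 1) intermediate elements.
    halo = kernel - 1

    # The buffer must hold the intermediate data for the largest tile.
    buffer_elems = min(tile, out_len) + halo

    # Sum the intermediate elements computed over all tiles,
    # including a shorter final tile when tile does not divide out_len.
    computed = 0
    for start in range(0, out_len, tile):
        tile_len = min(tile, out_len - start)
        computed += tile_len + halo

    return buffer_elems, computed


# Sweep tile sizes for a 64-element output and a kernel of size 3.
for tile in (64, 16, 8, 1):
    buffer_elems, computed = fused_conv1d_costs(out_len=64, kernel=3, tile=tile)
    print(f"tile={tile:2d}  buffer={buffer_elems:2d}  computed={computed}")
```

Under these assumptions, smaller tiles shrink the intermediate buffer but increase the number of intermediate elements computed. In this 1D toy, retaining the overlap instead would need no extra capacity; the tension between retention and recomputation is expected to appear once tiling spans more than one dimension.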

Code & Models

Repositories

accelergy-project/looptree-tutorial
Official
