Capstan: A Vector RDA for Sparsity

Alexander Rucker; Matthew Vilim; Tian Zhao; Yaqi Zhang; Raghu Prabhakar; Kunle Olukotun

arXiv:2104.12760·cs.AR·September 24, 2021

TL;DR

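Capstan is a scalable, reconfigurable dataflow accelerator (RDA) for sparse and dense tensor applications. On a variety of sparse applications it is 18x faster than a multi-core CPU baseline (with DDR4 memory) and 16x faster than an Nvidia V100 GPU (with HBM2 memory), relying on vectorized sparse iteration and out-of-order random-access memories.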

Contribution

The paper introduces an application-independent RDA design that starts from common sparse data formats rather than from a single application, and maps declarative sparse iteration and memory primitives onto vectorized, high-performance hardware.

Findings

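01. For a variety of sparse applications, Capstan with DDR4 memory is 18x faster than a multi-core CPU baseline.

02. For the same class of applications, Capstan with HBM2 memory is 16x faster than an Nvidia V100 GPU.

03. For sparse applications that can be mapped to Plasticine, a recent dense RDA, Capstan is 7.6x to 365x faster.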

Abstract

This paper proposes Capstan: a scalable, parallel-patterns-based, reconfigurable dataflow accelerator (RDA) for sparse and dense tensor applications. Instead of designing for one application, we start with common sparse data formats, each of which supports multiple applications. Using a declarative programming model, Capstan supports application-independent sparse iteration and memory primitives that can be mapped to vectorized, high-performance hardware. We optimize random-access sparse memories with configurable out-of-order execution to increase SRAM random-access throughput from 32% to 80%. For a variety of sparse applications, Capstan with DDR4 memory is 18x faster than a multi-core CPU baseline, while Capstan with HBM2 memory is 16x faster than an Nvidia V100 GPU. For sparse applications that can be mapped to Plasticine, a recent dense RDA, Capstan is 7.6x to 365x faster and…
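To make the phrases "common sparse data formats" and "random-access sparse memories" concrete, here is a minimal sketch of sparse matrix-vector multiplication over the compressed sparse row (CSR) format. It is an illustration only, not Capstan's programming model or hardware mapping: the abstract does not say which formats Capstan uses, so CSR is an assumed example, and the function name `csr_spmv` and the sample matrix are hypothetical.

```python
from typing import List, Sequence


def csr_spmv(
    indptr: Sequence[int],
    indices: Sequence[int],
    data: Sequence[float],
    x: Sequence[float],
) -> List[float]:
    """Multiply a CSR matrix by a dense vector x (illustrative sketch)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Sparse iteration: visit only the stored nonzeros of this row.
        for k in range(indptr[row], indptr[row + 1]):
            # x[indices[k]] is a data-dependent (random) access: its address
            # is known only after the column index has been read.
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(indptr, indices, data, x)
print(y)  # [7.0, 9.0, 14.0]
```

The inner loop shows the two patterns the abstract refers to: iteration bounded by the sparse structure rather than by the matrix dimensions, and indirect reads whose addresses depend on the data. The second pattern is the kind of random access whose SRAM throughput the abstract reports raising from 32% to 80%.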
