Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights

Shail Dave; Riyadh Baghdadi; Tony Nowatzki; Sasikanth Avancha; Aviral Shrivastava; Baoxin Li

arXiv:2007.00864 · cs.AR · August 11, 2021

TL;DR

This survey reviews hardware acceleration techniques for sparse and irregular tensor computations in machine learning, analyzing design trade-offs, recent trends, and opportunities for optimization in hardware/software co-design.

Contribution

It provides a comprehensive categorization and analysis of hardware designs and acceleration methods for sparse, irregular tensors in ML models, highlighting recent trends and future opportunities.

Findings

01 Analysis of hardware and execution costs for various designs

02 Identification of key challenges in accelerating sparse tensors

03 Insights into recent hardware design trends and optimization opportunities

Abstract

Machine learning (ML) models are widely used in many important domains. For efficiently processing these computational- and memory-intensive applications, tensors of these over-parameterized models are compressed by leveraging sparsity, size reduction, and quantization of tensors. Unstructured sparsity and tensors with varying dimensions yield irregular computation, communication, and memory access patterns; processing them on hardware accelerators in a conventional manner does not inherently leverage acceleration opportunities. This paper provides a comprehensive survey on the efficient execution of sparse and irregular tensor computations of ML models on hardware accelerators. In particular, it discusses enhancement modules in the architecture design and the software support; categorizes different hardware designs and acceleration techniques and analyzes them in terms of hardware and…
