# A Computational Model for Tensor Core Units

**Authors:** Rezaul Chowdhury, Francesco Silvestri, Flavio Vella

arXiv: 1908.06649 · 2020-07-10

## TL;DR

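This paper introduces the TCU model, a computational model that captures the ability of tensor core hardware to natively multiply small matrices. The authors use it to design fast algorithms for matrix operations, graph problems, the Discrete Fourier Transform, stencil computations, integer multiplication, and polynomial evaluation, and they relate it to the external memory model.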

## Contribution

The paper proposes the TCU model, which captures the ability of tensor core hardware to natively multiply dense matrices of a given small size, so that a broader class of algorithms can exploit such hardware.

## Key findings

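- Designed fast algorithms in the TCU model for dense and sparse matrix multiplication and Gaussian elimination.
- Applied the TCU model to graph algorithms (transitive closure, all pairs shortest distances), the Discrete Fourier Transform, stencil computations, integer multiplication, and polynomial evaluation.
- Highlighted a relation between the TCU model and the external memory model.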

## Abstract

To respond to the need of efficient training and inference of deep neural networks, a plethora of domain-specific hardware architectures have been introduced, such as Google Tensor Processing Units and NVIDIA Tensor Cores. A common feature of these architectures is a hardware circuit for efficiently computing a dense matrix multiplication of a given small size. In order to broaden the class of algorithms that exploit these systems, we propose a computational model, named the TCU model, that captures the ability to natively multiply small matrices. We then use the TCU model for designing fast algorithms for several problems, including matrix operations (dense and sparse multiplication, Gaussian Elimination), graph algorithms (transitive closure, all pairs shortest distances), Discrete Fourier Transform, stencil computations, integer multiplication, and polynomial evaluation. We finally highlight a relation between the TCU model and the external memory model.
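The central idea of the abstract, building larger computations from a primitive that natively multiplies small matrices, can be illustrated with a blocked dense matrix multiplication. The following is a minimal sketch, not the paper's algorithm or cost analysis: the tile size `TILE` and the names `tcu_multiply`, `tile`, and `blocked_matmul` are hypothetical, and the "hardware" primitive is simulated in plain Python.

```python
from typing import List

Matrix = List[List[float]]

# Hypothetical native tile size: the simulated unit multiplies TILE x TILE matrices.
TILE = 2


def tcu_multiply(a: Matrix, b: Matrix) -> Matrix:
    """Stand-in for the hardware primitive (an assumption for illustration):
    multiply two TILE x TILE matrices."""
    assert len(a) == TILE and len(b) == TILE
    assert all(len(row) == TILE for row in a + b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(TILE)) for j in range(TILE)]
        for i in range(TILE)
    ]


def tile(m: Matrix, r: int, c: int) -> Matrix:
    """Extract the TILE x TILE block whose top-left corner is (r, c)."""
    return [row[c:c + TILE] for row in m[r:r + TILE]]


def blocked_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two n x n matrices (n a multiple of TILE) using only
    tcu_multiply for the multiplications; additions happen outside the unit."""
    n = len(a)
    assert n % TILE == 0 and len(b) == n
    assert all(len(row) == n for row in a + b)
    out = [[0.0] * n for _ in range(n)]
    for i in range(0, n, TILE):
        for j in range(0, n, TILE):
            for k in range(0, n, TILE):
                # One call to the simulated tensor unit per tile pair.
                p = tcu_multiply(tile(a, i, k), tile(b, k, j))
                for di in range(TILE):
                    for dj in range(TILE):
                        out[i + di][j + dj] += p[di][dj]
    return out
```

For n x n inputs this sketch issues (n / `TILE`)^3 calls to the primitive. How each call is charged, and how the paper's algorithms improve on this straightforward schedule, is specified in the full text rather than in this summary.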

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.06649/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1908.06649/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1908.06649/full.md

---
Source: https://tomesphere.com/paper/1908.06649