# TensorFlow Doing HPC

**Authors:** Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter

arXiv: 1903.04364 · 2020-03-03

## TL;DR

This paper evaluates TensorFlow's performance on HPC workloads, demonstrating its potential as a framework for heterogeneous supercomputers through benchmark tests on traditional HPC applications.

## Contribution

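It presents a performance analysis of TensorFlow on four traditional HPC benchmarks (STREAM, matrix-matrix multiply, a Conjugate Gradient solver, and FFT) run on two accelerator-equipped supercomputers, showing that TensorFlow can make effective use of high-performance networks and accelerators.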

## Key findings

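- In the STREAM benchmark, TensorFlow achieves over 50% of the theoretical communication bandwidth of the test platform.
- Going from two to four GPUs yields approximately 2x, 1.7x, and 1.8x speedups for matrix-matrix multiply, CG, and FFT, respectively.
- The results indicate that TensorFlow has high potential as an HPC programming framework for heterogeneous supercomputers.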
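To put the two-to-four-GPU speedups quoted in the abstract in context, the short sketch below converts them into scaling efficiency, meaning the fraction of the ideal 2x gain from doubling the GPU count. The speedup values come from the abstract, where they are stated as approximate; the function and variable names are illustrative assumptions and do not appear in the paper.

```python
# Approximate speedups reported in the abstract when going from two to four GPUs.
reported_speedup = {
    "matrix-matrix multiply": 2.0,
    "CG": 1.7,
    "FFT": 1.8,
}


def scaling_efficiency(speedup, gpus_before=2, gpus_after=4):
    """Return the fraction of ideal (linear) speedup that was achieved."""
    ideal_speedup = gpus_after / gpus_before
    return speedup / ideal_speedup


# Hypothetical helper dictionary: efficiency per benchmark.
efficiencies = {
    name: scaling_efficiency(speedup)
    for name, speedup in reported_speedup.items()
}

for name, efficiency in efficiencies.items():
    print(f"{name}: {efficiency:.0%} of ideal scaling")
```

Under these figures, matrix-matrix multiply scales close to ideally, while CG and FFT lose roughly 15% and 10% of the ideal gain.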

## Abstract

TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML) applications, in fact TensorFlow aims at supporting the development of a much broader range of application kinds that are outside the ML domain and can possibly include HPC applications. However, very few experiments have been conducted to evaluate TensorFlow performance when running HPC workloads on supercomputers. This work addresses this lack by designing four traditional HPC benchmark applications: STREAM, matrix-matrix multiply, Conjugate Gradient (CG) solver and Fast Fourier Transform (FFT). We analyze their performance on two supercomputers with accelerators and evaluate the potential of TensorFlow for developing HPC applications. Our tests show that TensorFlow can fully take advantage of high performance networks and accelerators on supercomputers. Running our TensorFlow STREAM benchmark, we obtain over 50% of theoretical communication bandwidth on our testing platform. We find an approximately 2x, 1.7x and 1.8x performance improvement when increasing the number of GPUs from two to four in the matrix-matrix multiply, CG and FFT applications respectively. All our performance results demonstrate that TensorFlow has high potential of emerging also as HPC programming framework for heterogeneous supercomputers.
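For readers unfamiliar with the Conjugate Gradient benchmark named in the abstract, here is a minimal sketch of the textbook CG iteration for a symmetric positive-definite system. It is written in plain Python for readability and is not the authors' TensorFlow implementation; the helper names (`matvec`, `dot`, `conjugate_gradient`) and the small 2x2 test system are assumptions made for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, which equals b when x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                       # step length
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * Ap_i for r_i, Ap_i in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                            # direction update factor
        p = [r_i + beta * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return x


# Small illustrative system (not taken from the paper).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
residual = [b_i - Ax_i for b_i, Ax_i in zip(b, matvec(A, solution))]
```

Each iteration is dominated by one matrix-vector product and a few vector reductions, which are the kinds of operations that a GPU-backed framework can offload to accelerators.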

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.04364/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1903.04364/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1903.04364/full.md

---
Source: https://tomesphere.com/paper/1903.04364