A C++11 implementation of arbitrary-rank tensors for high-performance computing
Alejandro M. Aragón

TL;DR
This paper presents an efficient C++11 implementation of arbitrary-rank tensors, enabling high-performance computing with a flexible Array class and expression templates, tested on CPU and GPU.
Contribution
It introduces a versatile tensor class template with expression templates for high-performance algebraic computations in C++11.
Findings
Efficient tensor operations achieved on CPU and GPU.
Flexible and high-level tensor algebra in C++11.
Maintains performance without sacrificing abstraction.
Abstract
This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.
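The two ideas named in the abstract, a single rank-parameterized class template and an expression template layer on top of it, can be illustrated with a minimal C++11 sketch. This is not the paper's code: the names `Expr`, `Sum`, and `Scaled`, the fixed `double` element type, and the column-major storage order are assumptions made only for illustration, and `Array` here is a simplified stand-in for the class template the abstract describes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base (hypothetical name): marks a type as usable in lazy arithmetic.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy element-wise sum of two expressions; no temporary array is allocated.
template <class L, class R>
class Sum : public Expr<Sum<L, R>> {
    const L& l_;
    const R& r_;
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
};

// Lazy product of a scalar and an expression.
template <class E>
class Scaled : public Expr<Scaled<E>> {
    double s_;
    const E& e_;
public:
    Scaled(double s, const E& e) : s_(s), e_(e) {}
    double operator[](std::size_t i) const { return s_ * e_[i]; }
    std::size_t size() const { return e_.size(); }
};

// Rank-K tensor of doubles in contiguous storage.
// Assumption: column-major layout (first index varies fastest).
template <std::size_t K>
class Array : public Expr<Array<K>> {
    std::array<std::size_t, K> dims_;
    std::vector<double> data_;

    // Recursion over the variadic indices; ends when no index is left.
    std::size_t offset(std::size_t, std::size_t) const { return 0; }
    template <class... Rest>
    std::size_t offset(std::size_t d, std::size_t stride,
                       std::size_t i, Rest... rest) const {
        return i * stride + offset(d + 1, stride * dims_[d], rest...);
    }
public:
    // One extent per dimension, e.g. Array<2> m(rows, cols).
    template <class... Dims>
    explicit Array(Dims... dims) : dims_{{static_cast<std::size_t>(dims)...}} {
        static_assert(sizeof...(Dims) == K, "one extent per dimension");
        std::size_t n = 1;
        for (std::size_t d : dims_) n *= d;
        data_.assign(n, 0.0);  // zero-initialized storage
    }

    // Evaluate any expression in a single loop over the elements.
    template <class E>
    Array& operator=(const Expr<E>& e) {
        const E& src = e.self();
        assert(src.size() == data_.size());
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = src[i];
        return *this;
    }

    // Multi-index element access, e.g. m(i, j).
    template <class... Idx>
    double& operator()(Idx... idx) {
        static_assert(sizeof...(Idx) == K, "one index per dimension");
        return data_[offset(0, 1, static_cast<std::size_t>(idx)...)];
    }

    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
};

// Operators build expression objects instead of computing results eagerly.
template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <class E>
Scaled<E> operator*(double s, const Expr<E>& e) {
    return Scaled<E>(s, e.self());
}
```

With these definitions, `Array<1>`, `Array<2>`, and `Array<3>` play the roles of vector, matrix, and rank-3 tensor, and a statement such as `c = 2.0 * a + b;` is evaluated in one pass without intermediate arrays. Because the expression nodes hold references to their operands, the sketch is only safe when the expression is assigned within the same statement; storing it in an `auto` variable would leave dangling references to temporaries.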
