Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, Albert Cohen

TL;DR
Tensor Comprehensions introduces a framework-agnostic, high-performance compilation system for deep learning models, enabling efficient operator implementation and optimization across various hardware platforms.
Contribution
The paper presents a language close to the mathematical notation of deep learning, together with a JIT compiler that optimizes the resulting computations through operator fusion and specialization to specific tensor sizes.
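To make the idea of a mathematical notation plus operator fusion concrete, here is a minimal sketch in plain Python. It is not the paper's compiler or API. The function names (`matmul`, `relu`, `matmul_relu_fused`) are hypothetical, and the comprehension-style notation in the comments only approximates the paper's language. The sketch assumes that an index appearing only on the right-hand side is summed over, and that fusion means computing two operators in one loop nest without storing the intermediate tensor.

```python
# Illustrative sketch only: plain-Python semantics of an index-notation
# expression and of operator fusion. All names here are hypothetical.

def matmul(A, B):
    # Approximate notation: C(m, n) += A(m, k) * B(k, n)
    # k appears only on the right-hand side, so it is the reduction index.
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]  # output starts at the identity of +
    for m in range(M):
        for n in range(N):
            for k in range(K):
                C[m][n] += A[m][k] * B[k][n]
    return C

def relu(X):
    # Pointwise operator: no reduction index, one output per input element.
    return [[max(x, 0.0) for x in row] for row in X]

def matmul_relu_fused(A, B):
    # Same result as relu(matmul(A, B)), but the intermediate matrix C is
    # never materialized: each element is reduced, clamped, and stored once.
    M, K, N = len(A), len(B), len(B[0])
    out = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for k in range(K):
                acc += A[m][k] * B[k][n]
            out[m][n] = max(acc, 0.0)
    return out

A = [[1.0, -2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, -8.0]]

unfused = relu(matmul(A, B))
fused = matmul_relu_fused(A, B)
print(unfused == fused)  # the two strategies agree on this input
```

The fused version makes one pass over the output and skips one full write and read of the intermediate matrix, which is the kind of saving a fusing compiler aims for. Because the loop bounds are taken from the concrete input shapes, a compiler that knows the sizes ahead of time can in principle also pick tiling and unrolling choices per size.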
Findings
Achieves high-performance execution of deep learning models
Enables custom operator development with reduced engineering effort
Provides significant speedups through autotuning and optimization
Abstract
Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in automatic translation, speech-to-text, scene understanding, ranking user preferences, ad placement, etc. Competing frameworks for building these networks such as TensorFlow, Chainer, CNTK, Torch/PyTorch, Caffe1/2, MXNet and Theano, explore different tradeoffs between usability and expressiveness, research or production orientation and supported hardware. They operate on a DAG of computational operators, wrapping high-performance libraries such as CUDNN (for NVIDIA GPUs) or NNPACK (for various CPUs), and automate memory allocation, synchronization, distribution. Custom operators are needed where the computation does not fit existing high-performance library calls, usually at a high engineering cost. This is…
Taxonomy
Topics: Computational Physics and Python Applications · Parallel Computing and Optimization Techniques · Tensor decomposition and applications
