Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations

Tri Dao; Albert Gu; Matthew Eichhorn; Atri Rudra; Christopher Ré

arXiv:1903.05895 · cs.LG · January 1, 2021 · 31 cites

TL;DR

This paper introduces a method to automatically learn fast algorithms for structured linear transforms, such as Fourier transforms, using butterfly factorizations, achieving near-optimal efficiency and improved performance in machine learning tasks.

Contribution

It presents a parameterization that can automatically discover efficient divide-and-conquer algorithms for structured transforms, including the FFT, without manual hand-crafting.
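
As a point of reference for the kind of hand-crafted divide-and-conquer algorithm mentioned above, here is a minimal sketch of the textbook recursive radix-2 FFT next to the dense O(N^2) transform. It is an illustration only, not the paper's learned parameterization; the function names `naive_dft` and `fft_recursive` are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def naive_dft(x):
    """Dense O(N^2) DFT: one full row of the DFT matrix per output entry."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_recursive(x):
    """Radix-2 divide-and-conquer FFT; assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Divide: transform the even- and odd-indexed halves separately.
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    # Conquer: combine the halves with one twiddle factor per output pair.
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8]
    dense = naive_dft(signal)
    fast = fft_recursive(signal)
    print(max(abs(a - b) for a, b in zip(dense, fast)))
```

Both functions compute the same transform; the recursion does O(N) combine work at each of log2(N) levels, which is where the O(N log N) cost comes from.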

Findings

01. Recovers the O(N log N) FFT algorithm to machine precision

02. Achieves 4X faster inference speed in neural network compression

03. Reduces parameters by 40X compared to unstructured matrices

Abstract

Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense matrix-vector multiplication, yet each has a specialized and highly efficient (subquadratic) algorithm. We ask to what extent hand-crafting these algorithms and implementations is necessary, what structural priors they encode, and how much knowledge is required to automatically learn a fast algorithm for a provided structured transform. Motivated by a characterization of fast matrix-vector multiplication as products of sparse matrices, we introduce a parameterization of divide-and-conquer methods that is capable of representing a large class of transforms. This generic formulation can automatically learn an efficient algorithm for many important…
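
The characterization of fast matrix-vector multiplication as a product of sparse matrices can be made concrete with the DFT. The sketch below is a hedged illustration, not the paper's code: it fixes the sparse factors to the known FFT twiddle values and uses a fixed bit-reversal permutation, whereas the paper's method treats such entries as learnable. All names (`bit_reverse`, `butterfly_factors`, `sparse_matvec`, `butterfly_dft`, `dense_dft`, `count_nonzeros`) are hypothetical, and N is assumed to be a power of two.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def butterfly_factors(n):
    """Return log2(n) sparse factors; each row is {column: value} with 2 nonzeros."""
    factors = []
    m = 2  # block size of the current stage
    while m <= n:
        half = m // 2
        rows = [None] * n
        for start in range(0, n, m):
            for k in range(half):
                i, j = start + k, start + k + half
                w = cmath.exp(-2j * cmath.pi * k / m)
                rows[i] = {i: 1, j: w}
                rows[j] = {i: 1, j: -w}
        factors.append(rows)
        m *= 2
    return factors


def sparse_matvec(rows, x):
    """Multiply a row-sparse matrix by a vector, touching only nonzero entries."""
    return [sum(v * x[c] for c, v in row.items()) for row in rows]


def butterfly_dft(x):
    """Apply the bit-reversal permutation, then each sparse factor in turn."""
    n = len(x)
    bits = n.bit_length() - 1
    y = [x[bit_reverse(i, bits)] for i in range(n)]
    for rows in butterfly_factors(n):
        y = sparse_matvec(rows, y)
    return y


def dense_dft(x):
    """Reference O(N^2) DFT used only to check the factorization."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def count_nonzeros(factors):
    """Total number of stored entries across all sparse factors."""
    return sum(len(row) for rows in factors for row in rows)


if __name__ == "__main__":
    signal = [complex(i, -i) for i in range(8)]
    err = max(abs(a - b) for a, b in zip(dense_dft(signal), butterfly_dft(signal)))
    print(err, count_nonzeros(butterfly_factors(8)))
```

Each factor stores 2N entries, so the whole product holds 2N log2(N) nonzeros. For N = 1024 that is 20,480 stored values, against 1,048,576 in the dense matrix.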

Code & Models

Repositories

HazyResearch/butterfly (PyTorch, official)

Taxonomy

Topics: Tensor decomposition and applications · Sparse and Compressive Sensing Techniques · Parallel Computing and Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings