Sparse Linear Networks with a Fixed Butterfly Structure: Theory and Practice
Nir Ailon, Omer Leibovich, Vineet Nair

TL;DR
This paper introduces an architecture based on the butterfly network that replaces dense linear layers in neural networks, reducing the number of weights from quadratic to nearly linear with little loss of expressibility. It reports faster training and prediction, supported by theoretical analysis and empirical results.
Contribution
It proposes replacing dense layers with butterfly networks in neural architectures, offering a more efficient structure with theoretical and empirical validation.
Findings
Reduces the number of weights in a dense linear layer from quadratic to nearly linear.
Matches or outperforms existing architectures in NLP and vision tasks.
Offers faster training and inference without sacrificing accuracy.
Abstract
A butterfly network consists of logarithmically many layers, each with a linear number of non-zero weights (pre-specified). The fast Johnson-Lindenstrauss transform (FJLT) can be represented as a butterfly network followed by a projection onto a random subset of the coordinates. Moreover, a random matrix based on FJLT with high probability approximates the action of any matrix on a vector. Motivated by these facts, we propose to replace a dense linear layer in any neural network by an architecture based on the butterfly network. The proposed architecture significantly improves upon the quadratic number of weights required in a standard dense layer to nearly linear with little compromise in expressibility of the resulting operator. In a wide variety of experiments, including supervised prediction on both NLP and vision data, we show that this not only produces results…
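The structure described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input size is a power of two, layer l pairs coordinate j with coordinate j XOR 2^l, and the names init_butterfly, butterfly_apply, truncated_butterfly, and hadamard_weights are hypothetical. The Hadamard special case is included because the Walsh-Hadamard transform used inside the FJLT is itself a butterfly network with fixed weights of +1 and -1.

```python
import numpy as np


def init_butterfly(n, rng):
    """Random weights for a butterfly network on n = 2^k inputs.

    Returns k arrays of shape (n, 2). In layer l, column 0 multiplies x[j]
    and column 1 multiplies x[j ^ 2**l], so each layer has 2n weights.
    """
    k = n.bit_length() - 1
    return [rng.standard_normal((n, 2)) / np.sqrt(2.0) for _ in range(k)]


def butterfly_apply(weights, x):
    """Apply the butterfly layers to a 1-D vector x of length n."""
    idx = np.arange(x.shape[0])
    for l, w in enumerate(weights):
        partner = idx ^ (1 << l)  # the one other input each output sees
        x = w[:, 0] * x + w[:, 1] * x[partner]
    return x


def truncated_butterfly(weights, x, keep):
    """Butterfly network followed by projection onto the coordinates in keep."""
    return butterfly_apply(weights, x)[keep]


def hadamard_weights(n):
    """Fixed weights that make the butterfly compute the (unnormalized)
    Walsh-Hadamard transform."""
    idx = np.arange(n)
    weights = []
    for l in range(n.bit_length() - 1):
        bit_set = (idx >> l) & 1
        w = np.ones((n, 2))
        w[:, 0] = 1.0 - 2.0 * bit_set  # +1 if bit l of j is 0, else -1
        weights.append(w)
    return weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 16, 4
    x = rng.standard_normal(n)
    keep = rng.choice(n, size=m, replace=False)  # random coordinate subset
    y = truncated_butterfly(init_butterfly(n, rng), x, keep)
    print(y.shape)
```

In a trainable layer the per-layer weight arrays would be learned parameters while the sparsity pattern (which two inputs feed each output) stays fixed, which is what "pre-specified" refers to in the abstract.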
Taxonomy
Topics: Face and Expression Recognition · Video Surveillance and Tracking Methods · Sparse and Compressive Sensing Techniques
Methods: Linear Layer
