ACDC: A Structured Efficient Linear Layer
Marcin Moczulski, Misha Denil, Jeremy Appleyard, Nando de Freitas

TL;DR
This paper introduces ACDC, a structured linear layer that reduces parameters and computational costs using diagonal matrices and the discrete cosine transform, enabling efficient deep learning models.
Contribution
The paper presents a novel ACDC module with $O(N)$ parameters and $O(N \log N)$ operations, and demonstrates its effectiveness in deep neural networks for image recognition.
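To make the scaling concrete, here is a rough worked comparison. It assumes a square layer of width $N = 4096$ (an illustrative size, not one taken from the paper), counts only the two diagonals of the ACDC module, and ignores biases:

$$
\underbrace{N^2}_{\text{dense layer}} = 4096^2 = 16{,}777{,}216
\qquad \text{vs.} \qquad
\underbrace{2N}_{\text{two diagonals}} = 2 \cdot 4096 = 8{,}192 .
$$

Under these assumptions the structured module stores about $N/2 = 2048$ times fewer parameters than the dense layer it replaces.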
Findings
Deep cascades of ACDC layers can approximate linear layers, as supported by theoretical results.
ACDC layers can be interleaved with ReLU modules in convolutional neural networks for image recognition.
Training factors like initialization and depth are critical.
Abstract
The linear layer is one of the most pervasive modules in deep learning representations. However, it requires $O(N^2)$ parameters and $O(N^2)$ operations. These costs can be prohibitive in mobile applications or prevent scaling in many domains. Here, we introduce a deep, differentiable, fully-connected neural network module composed of diagonal matrices of parameters, $\mathbf{A}$ and $\mathbf{D}$, and the discrete cosine transform $\mathbf{C}$. The core module, structured as $\mathbf{ACDC}^{-1}$, has $O(N)$ parameters and incurs $O(N \log N)$ operations. We present theoretical results showing how deep cascades of ACDC layers approximate linear layers. ACDC is, however, a stand-alone module and can be used in combination with any other types of module. In our experiments, we show that it can indeed be successfully interleaved with ReLU modules in convolutional neural networks for image…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Image Enhancement Techniques · Face and Expression Recognition
Methods: Linear Layer
