ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks
Alexander Frickenstein, Manoj-Rohit Vemparala, Nael Fasfous, Laura Hauenschild, Naveen-Shankar Nagaraja, Christian Unger, Walter Stechele

TL;DR
This paper introduces ALF, an autoencoder-based low-rank filter-sharing method that compresses CNNs, reducing parameters and computation with minimal accuracy loss so that the networks fit resource-constrained embedded devices.
Contribution
ALF is a novel autoencoder-based low-rank filter-sharing technique that outperforms traditional pruning methods in CNN compression for embedded applications.
Findings
ALF reduces network parameters by 70%.
ALF decreases operations by 61%.
ALF cuts execution time by 41%.
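To make the idea of parameter savings from filter sharing concrete, here is a minimal sketch that counts weights for a standard convolution and for a generic low-rank factorization in which a small set of shared filters feeds a 1 x 1 expansion. This is an illustrative assumption about how low-rank filter sharing can save parameters, not the ALF procedure itself; the function names, the layer shape (64 input and output channels, 3 x 3 kernels), and the code size of 16 are all hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def shared_filter_params(c_in, c_out, k, c_code):
    """Weight count when c_code shared k x k filters are followed by a
    1 x 1 expansion back to c_out channels (hypothetical factorization)."""
    return c_in * c_code * k * k + c_code * c_out


# Hypothetical layer: 64 -> 64 channels, 3 x 3 kernels, 16 shared filters.
dense = conv_params(64, 64, 3)
shared = shared_filter_params(64, 64, 3, 16)
reduction = 1 - shared / dense

print(f"dense: {dense}, shared: {shared}, reduction: {reduction:.1%}")
```

For this made-up layer the factorized form needs 10,240 weights instead of 36,864. The figure depends entirely on the chosen code size and says nothing about the percentages reported for ALF.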
Abstract
Closing the gap between the hardware requirements of state-of-the-art convolutional neural networks and the limited resources constraining embedded applications is the next big challenge in deep learning research. The computational complexity and memory footprint of such neural networks are typically daunting for deployment in resource-constrained environments. Model compression techniques, such as pruning, are emphasized among other optimization methods for solving this problem. Most existing techniques require domain expertise or result in irregular sparse representations, which increase the burden of deploying deep learning applications on embedded hardware accelerators. In this paper, we propose the autoencoder-based low-rank filter-sharing technique (ALF). When applied to various networks, ALF is compared to state-of-the-art pruning methods, demonstrating its efficient…
Taxonomy
Methods: Pruning
