A New Clustering-Based Technique for the Acceleration of Deep Convolutional Networks
Erion-Vasilis Pikoulis, Christos Mavrokefalidis, Aris S. Lalos

TL;DR
This paper introduces a clustering-based model compression technique for deep neural networks, aimed at resource-limited devices. By imposing a special structure on the cluster representatives, it can use more representatives than conventional k-means-based approaches while still achieving an acceleration gain over them.
Contribution
The paper proposes a novel clustering approach with a structured representative design that improves acceleration gains over standard k-means in model compression.
Findings
The method achieves higher acceleration compared to conventional k-means.
Extensive evaluations validate the method's superiority across various DNN models.
Theoretical analysis confirms the potential for increased efficiency in model compression.
Abstract
Deep learning, and especially the use of Deep Neural Networks (DNNs), provides impressive results in various regression and classification tasks. However, achieving these results places a high demand on computing and storage resources. This becomes problematic when, for instance, real-time mobile applications are considered, in which the involved (embedded) devices have limited resources. A common way of addressing this problem is to transform the original large pre-trained networks into new, smaller models by utilizing Model Compression and Acceleration (MCA) techniques. Within the MCA framework, we propose a clustering-based approach that is able to increase the number of employed centroids/representatives while, at the same time, achieving an acceleration gain compared to conventional k-means-based approaches. This is achieved by imposing a special structure on the employed…
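For orientation, the conventional k-means baseline that the abstract compares against can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's structured method, whose details are cut off in the truncated abstract above. It assumes the vectors being clustered are the rows of a weight matrix `W`. The helper names `kmeans` and `shared_matvec` are hypothetical. The point is only to show where the acceleration comes from: once rows share a representative, a product with an input needs one dot product per representative rather than one per row.

```python
import numpy as np

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means on the rows of `vectors`; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct rows (assumption: simple random init).
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distances between every row and every centroid: (n, k).
        d2 = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # keep the old centroid if a cluster becomes empty
                centroids[j] = members.mean(axis=0)
    # Final assignment, so labels are consistent with the returned centroids.
    d2 = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)

def shared_matvec(centroids, labels, x):
    """Approximate W @ x with k dot products followed by a table lookup."""
    table = centroids @ x   # one dot product per representative
    return table[labels]    # each original row reuses its representative's result

# Toy example: 64 weight rows of dimension 16, compressed to 8 representatives.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))
x = rng.normal(size=16)

centroids, labels = kmeans(W, k=8)
y_approx = shared_matvec(centroids, labels, x)  # 8 dot products instead of 64
y_exact = W @ x                                 # uncompressed reference
```

In this sketch, using fewer representatives lowers the cost of `shared_matvec` but increases the approximation error between `y_approx` and `y_exact`. That is the trade-off the abstract says the proposed structure is meant to improve on.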
