An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

TL;DR
This paper introduces a novel pattern-based sparsity method for DNN pruning that enhances accuracy and efficiency, enabling real-time inference on mobile devices through compiler optimization and pattern-aware pruning techniques.
Contribution
It proposes a new pattern-based sparsity framework combining pattern and connectivity sparsity, achieving high accuracy and hardware friendliness for mobile DNN inference.
Findings
Achieves accuracy enhancement across various DNN structures and datasets.
Enables real-time inference of large-scale DNNs on mobile devices.
Provides a comprehensive pattern-aware pruning and training framework.
Abstract
Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNNs), thereby achieving acceleration on various platforms. However, most pruning techniques are essentially trade-offs between model accuracy and regularity, which lead to impaired inference accuracy and limited on-device acceleration performance. To solve the problem, we introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly. With carefully designed patterns, the proposed pruning unprecedentedly and consistently achieves accuracy enhancement and better feature extraction ability on different DNN structures and datasets, and our pattern-aware pruning framework also achieves pattern library extraction, pattern selection, pattern and…
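To make the two sparsity types named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' framework. It assumes 3x3 kernels stored as flat lists of nine weights, a hypothetical pattern library in which every pattern keeps the centre weight plus three other positions, and a simple magnitude heuristic for choosing patterns and for dropping whole kernels. All function names are illustrative.

```python
from itertools import combinations

CENTER = 4  # index of the centre weight in a row-major 3x3 kernel

# Hypothetical pattern library (an assumption, not taken from the paper):
# every mask that keeps the centre plus three of the eight other positions.
PATTERN_LIBRARY = [
    frozenset((CENTER,) + extra)
    for extra in combinations([i for i in range(9) if i != CENTER], 3)
]


def kernel_energy(kernel, keep=range(9)):
    """Sum of squared weights over the positions listed in `keep`."""
    return sum(kernel[i] ** 2 for i in keep)


def apply_pattern(kernel):
    """Pattern sparsity: keep only the positions of the library pattern
    that retains the most squared magnitude; zero the rest."""
    best = max(PATTERN_LIBRARY, key=lambda p: kernel_energy(kernel, p))
    return [w if i in best else 0.0 for i, w in enumerate(kernel)]


def prune_connectivity(kernels, keep_ratio):
    """Connectivity sparsity: zero out whole kernels with the least energy."""
    n_keep = max(1, int(len(kernels) * keep_ratio))
    ranked = sorted(
        range(len(kernels)),
        key=lambda k: kernel_energy(kernels[k]),
        reverse=True,
    )
    kept = set(ranked[:n_keep])
    return [k if idx in kept else [0.0] * 9 for idx, k in enumerate(kernels)]


def pattern_prune(kernels, keep_ratio=0.5):
    """Apply a pattern to every kernel, then remove the weakest kernels."""
    return prune_connectivity([apply_pattern(k) for k in kernels], keep_ratio)


if __name__ == "__main__":
    kernels = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [0.1] * 9]
    for pruned_kernel in pattern_prune(kernels, keep_ratio=0.5):
        print(pruned_kernel)
```

Because every surviving kernel shares one of a small, fixed set of shapes, a compiler can in principle generate specialised code per pattern, which is the kind of regularity the abstract describes as hardware friendly. The magnitude heuristic here merely stands in for the paper's pattern-aware training procedure.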
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: Pruning
