Reduce Computational Complexity for Convolutional Layers by Skipping Zeros

Zhiyi Zhang; Pengfei Zhang; Zhuopin Xu; Qi Wang

arXiv:2306.15951 · cs.LG · August 27, 2024 · 1 cites

Open Access

TL;DR

This paper introduces the C-K-S algorithm, which reduces the computational cost of convolutional layers by skipping the zeros that padding and sparse intermediate tensors would otherwise add to the computation.

Contribution

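The paper presents the C-K-S algorithm, together with GPU implementations. C-K-S trims filters so that zero-padding is excluded from the computation, and for deconvolution and dilated convolution it transforms sparse tensors into dense ones, which removes redundant calculations and simplifies hardware control.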
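To make the idea of excluding zero-padding concrete, here is a minimal 1D sketch in plain Python. It is an illustration of the general principle only, not the paper's C-K-S algorithm or its GPU kernels; the function names `conv1d_padded` and `conv1d_trimmed` are hypothetical and chosen for this example. The first function materializes the padding, the second clips each filter window to the real input and multiplies only the overlapping taps.

```python
def conv1d_padded(x, w, pad):
    """Reference: materialize the zero-padding, then slide the full filter."""
    xp = [0.0] * pad + list(x) + [0.0] * pad
    k = len(w)
    return [sum(xp[i + j] * w[j] for j in range(k))
            for i in range(len(xp) - k + 1)]


def conv1d_trimmed(x, w, pad):
    """Same output without padding: clip each window to the real input
    and use only the filter taps that overlap it (assumed illustration)."""
    n, k = len(x), len(w)
    out = []
    for i in range(n + 2 * pad - k + 1):
        start = i - pad          # window start in unpadded coordinates
        lo = max(start, 0)       # first real input index inside the window
        hi = min(start + k, n)   # one past the last real input index
        out.append(sum(x[p] * w[p - start] for p in range(lo, hi)))
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 2.0, 3.0]
assert conv1d_padded(x, w, 1) == conv1d_trimmed(x, w, 1)
```

At the borders the trimmed version performs fewer multiplications than the padded one, and it never allocates the padded copy of the input. Interior windows are unchanged.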

Findings

01. C-K-S outperforms PyTorch and cuDNN in speed and convergence in certain scenarios.

02. C-K-S effectively trims filters and transforms sparse tensors, reducing redundant calculations.

03. Experimental results validate the efficiency and effectiveness of the proposed method.

Abstract

Convolutional neural networks necessitate good algorithms to reduce complexity, and sufficient utilization of parallel processors for acceleration. Within convolutional layers, there are three types of operators: convolution used in forward propagation, deconvolution and dilated-convolution utilized in backward propagation. During the execution of these operators, zeros are typically added to tensors, leading to redundant calculations and unnecessary strain on hardware. To circumvent these inefficiencies, we propose the C-K-S algorithm, accompanied by efficient GPU implementations. C-K-S trims filters to exclude zero-padding. For deconvolution and dilated-convolution, C-K-S transforms sparse tensors into dense tensors, and standardizes the local computational rules to simplify the hardware control. The experimental results demonstrate that C-K-S offers good performance in terms of speed…
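The sparse-to-dense idea for deconvolution can be illustrated with a small 1D sketch in plain Python. This is an assumed, simplified illustration of skipping inserted zeros, not the C-K-S algorithm itself; `deconv1d_zero_inserted` and `deconv1d_dense` are hypothetical names. The reference version inserts `stride - 1` zeros between elements and pads by `k - 1` before sliding the flipped filter; the dense version visits only the filter taps that can meet a real element.

```python
def deconv1d_zero_inserted(dy, w, stride):
    """Reference: insert stride-1 zeros between elements, pad by k-1,
    then slide the flipped filter over the mostly-zero tensor."""
    k = len(w)
    dz = []
    for i, v in enumerate(dy):
        if i > 0:
            dz.extend([0.0] * (stride - 1))
        dz.append(v)
    dzp = [0.0] * (k - 1) + dz + [0.0] * (k - 1)
    wf = w[::-1]  # flipped filter
    return [sum(dzp[p + j] * wf[j] for j in range(k))
            for p in range(len(dzp) - k + 1)]


def deconv1d_dense(dy, w, stride):
    """Same output without inserted zeros: for output position p, only
    taps t with t % stride == p % stride can align with a real element."""
    m, k = len(dy), len(w)
    out = []
    for p in range((m - 1) * stride + k):
        acc = 0.0
        for t in range(p % stride, k, stride):
            i = (p - t) // stride   # exact division: p - t is a multiple of stride
            if 0 <= i < m:
                acc += dy[i] * w[t]
        out.append(acc)
    return out


dy = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 3.0]
assert deconv1d_zero_inserted(dy, w, 2) == deconv1d_dense(dy, w, 2)
```

With stride `s`, each output position in the dense version touches roughly `k / s` taps instead of `k`, and every output follows the same local rule, which is the kind of regularity the abstract describes as simplifying hardware control.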

Taxonomy

Topics: Model Reduction and Neural Networks · Image Enhancement Techniques · Digital Filter Design and Implementation

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution