Splitting Convolutional Neural Network Structures for Efficient Inference
Emad MalekHosseini, Mohsen Hajabdollahi, Nader Karimi, Shadrokh Samavi, Shahram Shirani

TL;DR
This paper proposes a novel method for splitting CNN structures into smaller parts to reduce memory usage and computational load during inference, demonstrated on VGG16 and ResNet18 for CIFAR10.
Contribution
A new network splitting technique that effectively reduces memory and computation for CNN inference, improving efficiency for large input data.
Findings
Memory consumption decreased significantly with splitting.
Computational operations were reduced.
Effective on VGG16 and ResNet18 architectures.
Abstract
For convolutional neural networks (CNNs) with a large volume of input data, memory management becomes a major concern. Reducing memory cost is an effective way to address this problem, and it can be realized through techniques such as feature map pruning and input data splitting. Among the methods in this area, splitting the network structure is an interesting direction in which a few works exist. This study addresses the problem of reducing memory utilization through network structure splitting. A new technique is proposed to split the network structure into small parts that consume less memory than the original network. The split parts can be processed almost separately, which plays an essential role in better memory management. The split approach has been tested on two well-known network structures of…
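The abstract does not spell out how the split is made, so the following is only an illustrative sketch of why splitting a layer into independent parts can lower memory and computation. It assumes a hypothetical scheme in which one 3x3 convolution is divided into equal channel groups that do not exchange information (as in a grouped convolution); this is an assumption for illustration, not the authors' method. The function names `conv_params`, `conv_macs`, and `split_conv_params` are hypothetical.

```python
# Illustrative arithmetic only: the split scheme below (equal, independent
# channel groups) is an assumption, not the method described in the paper.

def conv_params(c_in, c_out, k=3):
    """Number of weights in one k x k convolution layer (bias ignored)."""
    return c_in * c_out * k * k


def conv_macs(c_in, c_out, height, width, k=3):
    """Multiply-accumulate operations for a stride-1, same-padded convolution."""
    return conv_params(c_in, c_out, k) * height * width


def split_conv_params(c_in, c_out, parts, k=3):
    """Total weights when the layer is split into `parts` independent branches,
    each seeing c_in/parts input channels and producing c_out/parts outputs."""
    if c_in % parts or c_out % parts:
        raise ValueError("channel counts must be divisible by the number of parts")
    return parts * conv_params(c_in // parts, c_out // parts, k)


full = conv_params(256, 256)            # weights in the unsplit layer
split = split_conv_params(256, 256, 4)  # total weights across 4 parts
per_part = split // 4                   # weights resident if parts run one at a time

print(full, split, per_part)
```

Under this assumed scheme, splitting into p parts cuts the total weight count and the multiply-accumulate count by a factor of p, and if the parts are processed one after another, only one part's weights and feature maps need to be resident at a time.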
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Advanced Image and Video Retrieval Techniques
