Cooperative Inference with Interleaved Operator Partitioning for CNNs
Zhibang Liu, Chaonong Xu, Zhizhuo Liu, Lekai Huang, Jiachen Wei, and Chao Li

TL;DR
This paper introduces the Interleaved Operator Partitioning (IOP) strategy for CNNs, reducing communication delays and memory usage in cooperative inference on IoT devices by optimizing operator partitioning.
Contribution
The paper proposes a novel IOP strategy that partitions CNN operators to eliminate activation concatenation, improving inference speed and memory efficiency in cooperative deployment.
Findings
Speeds up inference by 6.39% to 16.83%.
Reduces peak memory footprint by 21.22% to 49.98%.
Outperforms existing partition methods in experiments.
Abstract
Deploying deep learning models on Internet of Things (IoT) devices is often challenging because of their limited memory and computing capabilities. Cooperative inference is an important method for addressing this issue; it requires partitioning an intelligent model and deploying the parts across devices. To perform horizontal partitions, existing cooperative inference methods take either the output channel of operators or the height and width of feature maps as the partition dimension. Because each operator's activations are then distributed across devices, they must be concatenated before being fed to the next operator, which adds delay to cooperative inference. In this paper, we propose the Interleaved Operator Partitioning (IOP) strategy for CNN models. By partitioning an operator based on the output channel dimension and its successive operator based on the input channel…
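The interleaving idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes two back-to-back 1x1 convolutions applied to a single pixel (so each operator is a matrix-vector product), a ReLU between them, and two hypothetical devices represented by channel groups. The names matvec, relu, reference, and interleaved are made up for this example, and the final element-wise sum of partial outputs is an assumption about how the per-device results are combined.

```python
# Sketch only: two 1x1 convolutions on one pixel, written as matrix-vector products.
# All names here are hypothetical and not taken from the paper.

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def relu(v):
    """Element-wise ReLU; it acts per channel, so it needs no cross-device data."""
    return [max(0, a) for a in v]

def reference(w1, w2, x):
    """Unpartitioned computation: op2(relu(op1(x)))."""
    return matvec(w2, relu(matvec(w1, x)))

def interleaved(w1, w2, x, channel_groups):
    """Partition op1 by output channel and op2 by input channel.

    Each group plays the role of one device. The hidden activation slice
    produced by a device is consumed on that same device, so the slices are
    never concatenated; only the partial outputs are summed (assumed step).
    """
    out = [0] * len(w2)
    for group in channel_groups:
        w1_part = [w1[c] for c in group]                    # rows: op1 output channels
        w2_part = [[row[c] for c in group] for row in w2]   # columns: op2 input channels
        hidden_part = relu(matvec(w1_part, x))              # stays local to the device
        partial = matvec(w2_part, hidden_part)              # full-size partial output
        out = [o + p for o, p in zip(out, partial)]
    return out

# Toy weights: 2 input channels, 4 hidden channels, 2 output channels.
W1 = [[1, -2], [0, 3], [2, 1], [-1, -1]]
W2 = [[1, 0, 2, -1], [3, 1, 0, 2]]
X = [1, 2]
GROUPS = [[0, 1], [2, 3]]  # hidden channels assigned to device 0 and device 1

assert interleaved(W1, W2, X, GROUPS) == reference(W1, W2, X)
```

In this toy setting the partitioned result matches the unpartitioned one exactly because the second operator is linear in its input channels, so its output is the sum of the contributions from each channel group.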
Taxonomy
Topics: Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis · Brain Tumor Detection and Classification
