On Vectorization of Deep Convolutional Neural Networks for Vision Tasks
Jimmy SJ. Ren, Li Xu

TL;DR
This paper studies how the key building blocks of deep CNNs for vision tasks can be vectorized, so that training and testing parallelize well and run faster. It also provides a unified framework with an efficient Matlab implementation.
Contribution
It offers a detailed study of vectorization in CNNs, introduces six implementations to compare vectorization effects, and provides a unified framework with a high-performance Matlab implementation.
Findings
Vectorization significantly improves training and testing speed.
Six implementations demonstrate the impact of different vectorization levels.
A unified framework supports various vision tasks efficiently.
Abstract
We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN). While the success mainly stems from the large volume of training data and the deep network architectures, the vector processing hardware (e.g. GPU) undisputedly plays a vital role in modern CNN implementations to support massive computation. Though much attention was paid in the extant literature to understand the algorithmic side of deep CNN, little research was dedicated to the vectorization for scaling up CNNs. In this paper, we studied the vectorization process of key building blocks in deep CNNs, in order to better understand and facilitate parallel implementation. Key steps in training and testing deep CNNs are abstracted as matrix and vector operators, upon which parallelism can be easily achieved. We developed and compared…
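To make the idea of abstracting a CNN step as a matrix operator concrete, here is a minimal sketch in Python with NumPy. It is not the authors' Matlab code. It assumes one common vectorization strategy (often called im2col), a single-channel image, stride 1, and no padding. The function names `conv2d_loops`, `im2col`, and `conv2d_vectorized` are hypothetical and chosen only for illustration.

```python
import numpy as np


def conv2d_loops(image, kernels):
    """Reference version: valid cross-correlation with explicit loops."""
    H, W = image.shape
    K, kh, kw = kernels.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    out = np.zeros((K, out_h, out_w))
    for k in range(K):
        for i in range(out_h):
            for j in range(out_w):
                out[k, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[k])
    return out


def im2col(image, kh, kw):
    """Lay out every kh-by-kw patch of the image as one column of a matrix."""
    H, W = image.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, out_h * out_w))
    for di in range(kh):
        for dj in range(kw):
            # Row (di, dj) holds that patch offset for all output positions.
            cols[di * kw + dj] = image[di:di + out_h, dj:dj + out_w].reshape(-1)
    return cols


def conv2d_vectorized(image, kernels):
    """Same result as conv2d_loops, expressed as one matrix product."""
    H, W = image.shape
    K, kh, kw = kernels.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = im2col(image, kh, kw)           # shape (kh*kw, out_h*out_w)
    weights = kernels.reshape(K, kh * kw)  # shape (K, kh*kw)
    return (weights @ cols).reshape(K, out_h, out_w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((8, 9))
    kernels = rng.standard_normal((3, 3, 2))
    same = np.allclose(conv2d_vectorized(image, kernels),
                       conv2d_loops(image, kernels))
    print("loop and vectorized outputs agree:", same)
```

In the vectorized version the only remaining loops run over the kernel offsets, not over output pixels or filters. The bulk of the arithmetic is a single matrix product, which is the kind of operation that vector hardware such as a GPU handles well.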
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
