Binary-decomposed DCNN for accelerating computation and compressing model without retraining

Ryuji Kamiya; Takayoshi Yamashita; Mitsuru Ambai; Ikuro Sato; Yuji Yamauchi; Hironobu Fujiyoshi

arXiv:1709.04731 · cs.CV · September 15, 2017 · 2 cites

TL;DR

This paper introduces a binary decomposition method for DCNNs that accelerates inference and compresses models significantly without retraining, making deep learning more feasible on low-performance devices.

Contribution

The proposed Binary-decomposed DCNN replaces real-valued computations with binary ones, enabling faster inference and smaller models without retraining.
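
A minimal sketch of the idea, under stated assumptions: a trained real-valued weight vector w is approximated as M c, where M has entries in {-1, +1} and c is a short real-valued vector, so that the inner product w·x can be evaluated as c·(Mᵀx) with no retraining step. The greedy residual fitting below is a generic technique chosen for illustration and is not confirmed to be the paper's decomposition algorithm. The names `binary_decompose` and `approx_dot` and the basis count `k` are hypothetical.

```python
import numpy as np

def binary_decompose(w, k):
    """Greedy residual decomposition (illustrative assumption, not the paper's
    confirmed algorithm): find M in {-1,+1}^(D x k) and real c with w ~= M @ c."""
    residual = np.asarray(w, dtype=np.float64).copy()
    D = residual.size
    M = np.empty((D, k), dtype=np.int8)
    c = np.empty(k, dtype=np.float64)
    for i in range(k):
        # Binary basis vector: the sign pattern of the current residual.
        m = np.where(residual >= 0, 1, -1).astype(np.int8)
        # For this m, the least-squares scale is (m . r) / D = mean(|r|).
        c[i] = np.abs(residual).mean()
        M[:, i] = m
        # Subtract the fitted component and continue on what is left.
        residual -= c[i] * m
    return M, c

def approx_dot(M, c, x):
    """Approximate w . x as c . (M^T x), using only the decomposed form."""
    return float(c @ (M.T @ x))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(256)  # stand-in for one trained filter
    x = rng.standard_normal(256)  # stand-in for one input patch
    for k in (1, 2, 4, 8):
        M, c = binary_decompose(w, k)
        err = np.linalg.norm(w - M @ c) / np.linalg.norm(w)
        print(k, round(err, 4), round(approx_dot(M, c, x) - float(w @ x), 4))
```

With this greedy scheme, a larger `k` never increases the approximation error, at the cost of more binary bases to store and evaluate.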

Findings

01. Speed increased by up to 2.07 times

02. Model size reduced by approximately 80%

03. Error rate increase limited to around 2.16%

Abstract

Recent trends show recognition accuracy continuing to improve markedly. The inference process of Deep Convolutional Neural Networks (DCNN) involves a large number of parameters, requires a large amount of computation, and can be very slow. The large number of parameters also requires a large amount of memory. The result is increasingly long computation times and large model sizes. To deploy DCNNs on mobile and other low-performance devices, model sizes must be compressed and computation must be accelerated. To that end, this paper proposes Binary-decomposed DCNN, which resolves these issues without the need for retraining. Our method replaces real-valued inner-product computations with binary inner-product computations in existing network models, which accelerates inference and decreases model size. Binary computations can be done at…
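
The binary inner products mentioned in the abstract operate on vectors whose entries are +1 or -1. One standard way to evaluate such a product on packed bits is an XOR followed by a bit count: each mismatching position contributes -1 and each matching position +1, so the result is n minus twice the number of mismatches. The sketch below illustrates that generic identity only and is not taken from the paper's implementation; `pack_signs` and `binary_dot` are hypothetical names.

```python
def pack_signs(v):
    """Pack a sequence of +1/-1 values into an int: bit i is set where v[i] == +1."""
    bits = 0
    for i, s in enumerate(v):
        if s not in (1, -1):
            raise ValueError("entries must be +1 or -1")
        if s == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Inner product of two length-n +/-1 vectors given in packed form."""
    # XOR marks the positions where the two sign vectors differ.
    mismatches = bin(a_bits ^ b_bits).count("1")
    # Matches contribute +1 each, mismatches -1 each.
    return n - 2 * mismatches

if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    print(binary_dot(pack_signs(a), pack_signs(b), len(a)))  # prints 0
    print(sum(p * q for p, q in zip(a, b)))                  # prints 0
```

Packing many signs into one machine word is what lets a single XOR and bit count stand in for many multiply-accumulate operations.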

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion-Convolutional Neural Networks