DP-Net: Dynamic Programming Guided Deep Neural Network Compression

Dingcheng Yang; Wenjian Yu; Ao Zhou; Haoyuan Mu; Gary Yao; Xiaoyi Wang

arXiv:2003.09615 · cs.LG · March 24, 2020 · 1 citation

TL;DR

DP-Net introduces a dynamic programming-based approach for neural network compression, achieving higher compression ratios while maintaining accuracy, and includes hardware acceleration for efficient inference.

Contribution

It presents a novel DP-based algorithm for weight quantization and a training method for clustering-friendly DNNs, enabling superior compression.

Findings

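01 Up to 77X compression on Wide ResNet, achieved by combining DP-Net with other compression techniques

02 Larger compression than state-of-the-art methods while preserving accuracy

03 Custom FPGA accelerator designed to speed up inference with DP-Net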

Abstract

In this work, we propose an effective scheme (called DP-Net) for compressing deep neural networks (DNNs). It includes a novel dynamic programming (DP) based algorithm that obtains the optimal solution of weight quantization, and an optimization process that trains a clustering-friendly DNN. Experiments show that DP-Net allows larger compression than state-of-the-art counterparts while preserving accuracy. The largest compression ratio, 77X on Wide ResNet, is achieved by combining DP-Net with other compression techniques. Furthermore, DP-Net is extended to compress a robust DNN model with negligible accuracy loss. Finally, a custom accelerator is designed on FPGA to speed up inference computation with DP-Net.
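
The abstract does not spell out the DP algorithm, so the following is only a minimal sketch of one plausible reading. It assumes weight quantization is posed as one-dimensional k-means over a layer's scalar weights, which can be solved exactly by dynamic programming over the sorted values. The function name `optimal_1d_quantization`, the squared-error objective, and the simple O(k n^2) recurrence are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple


def optimal_1d_quantization(weights: List[float], k: int) -> Tuple[float, List[float]]:
    """Exact 1-D k-means by dynamic programming (illustrative sketch).

    Returns the minimum total squared error and the k cluster centers
    (the shared quantized values), in ascending order.
    """
    xs = sorted(weights)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(weights)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def cost(i: int, j: int) -> float:
        """Squared error of xs[i:j] around its own mean (requires i < j)."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # dp[c][j]: best error when the first j sorted weights use c clusters.
    dp = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # last cluster is xs[i:j]
                cand = dp[c - 1][i] + cost(i, j)
                if cand < dp[c][j]:
                    dp[c][j] = cand
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the cluster means.
    centers: List[float] = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        centers.append((s1[j] - s1[i]) / (j - i))
        j = i
    centers.reverse()
    return dp[k][n], centers
```

Because the optimal clusters of scalar values are contiguous in sorted order, the recurrence only has to choose where the last cluster starts, which is what makes an exact solution tractable here. In a compression setting, each weight would then be stored as the index of its center rather than as a full-precision value.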

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling