Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks

Jun Nishikawa; Ryoji Ikegaya

arXiv:2011.06751 · cs.CV · November 26, 2020

TL;DR

This paper introduces a pruning method called PfQ to improve the fine-tuning of quantized deep neural networks by removing filters that hinder accuracy recovery, leading to better performance without extra parameters.

Contribution

The paper proposes a novel pruning technique, PfQ, that enhances quantized DNN fine-tuning by addressing batch normalization issues without additional hyper-parameters.

Findings

01. Achieves higher accuracy with similar model size compared to conventional methods

02. Effectively mitigates batch normalization effects during quantization

03. Improves fine-tuning performance of low-bit quantized DNNs

Abstract

Deep Neural Networks (DNNs) have many parameters and much activation data, both of which are expensive to implement. One method to reduce the size of a DNN is to quantize the pre-trained model by using a low-bit expression for weights and activations, and then to fine-tune it to recover the drop in accuracy. However, it is generally difficult to train neural networks that use low-bit expressions. One reason is that the weights in the middle layers of the DNN have a wide dynamic range, so when this range is quantized into a few bits, the step size becomes large, which leads to a large quantization error and ultimately a large degradation in accuracy. To solve this problem, this paper makes the following three contributions without using any additional learning parameters or hyper-parameters. First, we analyze how batch normalization, which causes the aforementioned problem, disturbs the…
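The link the abstract draws between a wide dynamic range, a large step size, and a large quantization error can be illustrated with a small sketch. This is not the paper's method: it assumes a plain symmetric uniform quantizer, synthetic Gaussian weights, and a single hypothetical outlier weight of 5.0, and every function name is chosen only for illustration.

```python
import random


def quantization_step(weights, bits):
    """Step size of a symmetric uniform quantizer that spans the full weight range."""
    max_abs = max(abs(w) for w in weights)
    levels = 2 ** (bits - 1) - 1  # number of positive levels for a signed grid
    return max_abs / levels


def quantize(weights, bits):
    """Round every weight to the nearest multiple of the step size."""
    step = quantization_step(weights, bits)
    return [round(w / step) * step for w in weights]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Assumption: synthetic weights with a narrow dynamic range.
narrow = [random.gauss(0.0, 0.1) for _ in range(1000)]
# Assumption: the same weights plus one hypothetical outlier that widens the range.
wide = narrow + [5.0]

step_narrow = quantization_step(narrow, bits=4)
step_wide = quantization_step(wide, bits=4)

# Compare the error on the shared 1000 weights only, so the outlier itself is excluded.
err_narrow = mean_squared_error(narrow, quantize(narrow, bits=4))
err_wide = mean_squared_error(wide[:-1], quantize(wide, bits=4)[:-1])

print(f"step size: narrow={step_narrow:.4f}, wide={step_wide:.4f}")
print(f"mean squared error: narrow={err_narrow:.6f}, wide={err_wide:.6f}")
```

In this toy setting the single outlier stretches the 4-bit grid so far that most of the small weights round to zero, which is the kind of accuracy loss the abstract attributes to wide dynamic ranges.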

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Pruning