Fixed-point optimization of deep neural networks with adaptive step size retraining

Sungho Shin, Yoonho Boo, and Wonyong Sung

arXiv:1702.08171 · cs.LG · February 28, 2017 · 2 cites

Open Access

TL;DR

This paper introduces an improved fixed-point optimization algorithm for deep neural networks that estimates the quantization step size dynamically during retraining, with the aim of improving low-precision model performance on feed-forward, convolutional, and recurrent networks.

Contribution

The paper proposes an adaptive step size estimation method and a gradual quantization scheme, which applies fixed-point optimization sequentially from high to low precision.

Findings

01. Dynamic step size estimation improves quantization accuracy.

02. Gradual quantization enhances low-precision neural network performance.

03. Applicable to FFDNNs, CNNs, and RNNs.

Abstract

Fixed-point optimization of deep neural networks plays an important role in hardware-based design and low-power implementations. Many deep neural networks show fairly good performance even with 2- or 3-bit precision when quantized weights are fine-tuned by retraining. We propose an improved fixed-point optimization algorithm that estimates the quantization step size dynamically during retraining. In addition, a gradual quantization scheme is also tested, which sequentially applies fixed-point optimizations from high to low precision. The experiments are conducted for feed-forward deep neural networks (FFDNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs).
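
The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the estimator, the re-estimation interval, or the precision schedule, so everything below is an assumption chosen for illustration. The step size is fitted by alternating L2-error minimization over a symmetric uniform quantizer, retraining is a toy linear regression that keeps floating-point weights and runs the forward pass on their quantized copy, and the names `estimate_step`, `retrain`, `gradual_quantize`, the interval `reestimate_every`, and the schedule `(8, 4, 3, 2)` are all hypothetical.

```python
import random

def max_level(bits):
    # Symmetric quantizer with 2**bits - 1 levels, e.g. bits=2 -> {-1, 0, +1}.
    return (2 ** bits - 1) // 2

def to_levels(weights, step, bits):
    # Integer level index for each weight: round to the grid, then clip.
    m = max_level(bits)
    return [max(-m, min(m, round(w / step))) for w in weights]

def quantize(weights, step, bits):
    return [z * step for z in to_levels(weights, step, bits)]

def l2_error(weights, step, bits):
    q = quantize(weights, step, bits)
    return sum((w - v) ** 2 for w, v in zip(weights, q))

def estimate_step(weights, bits, iters=20):
    # Assumed estimator: alternate between (1) assigning levels for the
    # current step and (2) the least-squares step for those levels.
    peak = max(abs(w) for w in weights)
    if peak == 0.0:
        return 1.0  # all-zero weights: every step size gives zero error
    step = peak / max_level(bits)
    for _ in range(iters):
        z = to_levels(weights, step, bits)
        denom = sum(v * v for v in z)
        if denom == 0:
            break
        step = sum(w * v for w, v in zip(weights, z)) / denom
    return step

def make_toy_data(true_w, n, rng):
    # Noise-free linear targets; stands in for a real training set.
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, sum(a * b for a, b in zip(true_w, x))))
    return data

def retrain(weights, data, bits, lr=0.05, epochs=30, reestimate_every=10):
    # Forward pass uses quantized weights; the update is applied to the
    # floating-point weights. The step size is re-estimated periodically
    # (the interval is an assumption, not taken from the paper).
    step = estimate_step(weights, bits)
    updates = 0
    for _ in range(epochs):
        for x, y in data:
            q = quantize(weights, step, bits)
            err = sum(a * b for a, b in zip(q, x)) - y
            weights = [w - lr * err * a for w, a in zip(weights, x)]
            updates += 1
            if updates % reestimate_every == 0:
                step = estimate_step(weights, bits)
    return weights, step

def gradual_quantize(weights, data, schedule=(8, 4, 3, 2)):
    # Retrain at each precision in turn, from high to low.
    step, bits = None, None
    for bits in schedule:
        weights, step = retrain(weights, data, bits)
    return weights, step, bits

if __name__ == "__main__":
    rng = random.Random(0)
    data = make_toy_data([0.9, -0.4, 0.1, -0.7], 50, rng)
    start = [rng.gauss(0.0, 0.5) for _ in range(4)]
    weights, step, bits = gradual_quantize(start, data)
    print(bits, step, quantize(weights, step, bits))
```

Each pass of `estimate_step` can only lower or preserve the L2 error between the floating-point and quantized weights, because both the level assignment and the step update are optimal with the other held fixed. In a real network the same estimate would typically be made per layer, and the toy regression loop would be replaced by ordinary backpropagation.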

Taxonomy

Topics: Digital Filter Design and Implementation · Advanced Image Processing Techniques · Model Reduction and Neural Networks