ILMPQ: An Intra-Layer Multi-Precision Deep Neural Network Quantization framework for FPGA

Sung-En Chang; Yanyu Li; Mengshu Sun; Yanzhi Wang; Xue Lin

arXiv:2111.00155 · cs.LG · November 2, 2021 · 1 citation

TL;DR

This paper introduces ILMPQ, an intra-layer multi-precision quantization framework for DNN inference on FPGAs. It reduces computation overhead while preserving accuracy, and reports a 3.65x inference speedup on ImageNet classification.

Contribution

The work proposes an intra-layer multi-precision quantization method for DNNs on FPGAs. Existing methods vary precision from layer to layer; this method mixes precisions within each layer, so that all layers can share one hardware configuration without giving up accuracy.

Findings

01

Achieves 70.73% Top-1 accuracy with ResNet-18 on ImageNet.

02

Realizes 3.65x inference speedup on FPGA devices.

03

Supports multiple precisions within each layer, so that hardware configurations can be unified across layers.

Abstract

This work targets commonly used FPGA (field-programmable gate array) devices as the hardware platform for DNN edge computing. We focus on DNN quantization as the main model compression technique. The novelty of this work is that our quantization method supports multiple precisions along the intra-layer dimension, whereas existing quantization methods apply multi-precision quantization along the inter-layer dimension. The intra-layer multi-precision method can unify the hardware configurations for different layers to reduce computation overhead, while preserving model accuracy as well as the inter-layer approach does. Our proposed ILMPQ DNN quantization framework achieves 70.73% Top-1 accuracy with ResNet-18 on the ImageNet dataset. We also validate the proposed MSP framework on two FPGA devices, the Xilinx XC7Z020 and XC7Z045. We achieve 3.65x speedup in end-to-end…
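To make the intra-layer idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It treats a layer as a list of weight rows (one row per filter) and gives each row its own bit-width, so every layer holds the same mix of precisions. The function names, the symmetric uniform quantizer, the 4-bit/8-bit choice, the 25% high-precision share, and the rule that rows with the largest absolute weight get the higher precision are all assumptions made for illustration; none of them is stated on this page.

```python
def quantize_row(row, bits):
    """Symmetric uniform quantization of one weight row to `bits` bits.

    Returns the de-quantized values and the step size (scale) used.
    """
    qmax = 2 ** (bits - 1) - 1  # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in row)
    if max_abs == 0.0:
        return list(row), 0.0  # an all-zero row needs no quantization
    scale = max_abs / qmax
    # Round to the nearest integer level and clamp to the representable range.
    levels = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return [q * scale for q in levels], scale


def quantize_layer_intra(weights, high_fraction=0.25, low_bits=4, high_bits=8):
    """Quantize one layer with a per-row (per-filter) bit-width.

    ASSUMPTION (illustrative only): the rows with the largest absolute
    weight receive `high_bits`; all other rows receive `low_bits`.
    """
    n_high = round(high_fraction * len(weights))
    ranked = sorted(
        range(len(weights)),
        key=lambda i: max(abs(w) for w in weights[i]),
        reverse=True,
    )
    high_rows = set(ranked[:n_high])
    bits = [high_bits if i in high_rows else low_bits
            for i in range(len(weights))]

    quantized, scales = [], []
    for row, b in zip(weights, bits):
        q_row, scale = quantize_row(row, b)
        quantized.append(q_row)
        scales.append(scale)
    return quantized, bits, scales


# Toy layer: 4 filters with 3 weights each (made-up values).
layer = [
    [0.90, -0.10, 0.33],
    [0.20, 0.05, -0.20],
    [0.50, 0.40, -0.45],
    [0.01, 0.02, -0.03],
]
quantized, bits, scales = quantize_layer_intra(layer)
for row_bits, row in zip(bits, quantized):
    print(row_bits, [round(w, 4) for w in row])
```

If the same `high_fraction` is applied to every layer, each layer ends up with the same ratio of low- to high-precision rows. That is one way to read the abstract's claim that a single hardware configuration can serve all layers, whereas an inter-layer scheme would change the bit-width from one layer to the next.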

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning