Pixel Embedding: Fully Quantized Convolutional Neural Network with Differentiable Lookup Table

Hiroyuki Tokunaga; Joel Nicholls; Daria Vazhenina; Atsunori Kanemura

arXiv:2407.16174·cs.LG·July 24, 2024

TL;DR

This paper introduces pixel embedding, which uses a differentiable lookup table to replace each input pixel with a trainable vector of quantized values. This lets the first layer be quantized too, narrowing the ImageNet top-5 error gap to 1% while running inference over 1.7x faster than floating point.

Contribution

The paper proposes pixel embedding, a technique that lets fully quantized neural networks represent high-bit input data by mapping each input pixel through a differentiable lookup table.

Findings

01

Reduces top-5 error gap on ImageNet to 1% with quantized first layer.

02

Achieves over 1.7x inference speedup compared to floating point.

03

Demonstrates effectiveness on CIFAR-100 with minimal error gap.

Abstract

By quantizing network weights and activations to low bitwidth, we can obtain hardware-friendly and energy-efficient networks. However, existing quantization techniques utilizing the straight-through estimator and piecewise constant functions face the issue of how to represent originally high-bit input data with low-bit values. To fully quantize deep neural networks, we propose pixel embedding, which replaces each float-valued input pixel with a vector of quantized values by using a lookup table. The lookup table or low-bit representation of pixels is differentiable and trainable by backpropagation. Such replacement of inputs with vectors is similar to word embedding in the natural language processing field. Experiments on ImageNet and CIFAR-100 show that pixel embedding reduces the top-5 error gap caused by quantizing the floating points at the first layer to only 1% for the ImageNet…
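The lookup idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table size (256 rows, one per 8-bit intensity), the embedding length `EMBED_DIM`, the bitwidth `BITS`, the uniform quantizer, and the straight-through update in `backward` are all assumptions chosen for illustration, and the class name `PixelEmbedding` is hypothetical.

```python
import random

NUM_LEVELS = 256   # one table row per 8-bit pixel intensity (assumption)
EMBED_DIM = 4      # length of each embedding vector (hypothetical choice)
BITS = 2           # bitwidth of each embedded value (hypothetical choice)


def quantize(x, bits=BITS):
    """Uniformly quantize x, clipped to [0, 1], onto 2**bits levels."""
    steps = (1 << bits) - 1
    clipped = min(max(x, 0.0), 1.0)
    return round(clipped * steps) / steps


class PixelEmbedding:
    """Illustrative lookup table mapping a pixel intensity to a quantized vector."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Latent full-precision table; the forward pass only exposes
        # its quantized form.
        self.table = [[rng.random() for _ in range(EMBED_DIM)]
                      for _ in range(NUM_LEVELS)]

    def forward(self, pixels):
        """Replace each integer pixel in [0, 255] with its quantized row."""
        return [[quantize(v) for v in self.table[p]] for p in pixels]

    def backward(self, pixels, grad_out, lr=0.1):
        """Assumed straight-through update: the gradient of the quantized
        output is applied directly to the latent row that was looked up."""
        for p, g in zip(pixels, grad_out):
            row = self.table[p]
            for d in range(EMBED_DIM):
                row[d] -= lr * g[d]


emb = PixelEmbedding(seed=0)
vectors = emb.forward([0, 128, 255])   # three pixels -> three low-bit vectors
emb.backward([128], [[1.0] * EMBED_DIM])  # only row 128 is updated
```

As with word embeddings, only the rows that were actually looked up receive a gradient, and pixels sharing an intensity share a row. At inference time the latent table could be quantized once and stored, so the first layer would see only low-bit values.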
