PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs

Lukas Meiner; Jens Mehnert; Alexandru Paul Condurache

arXiv:2505.03254·cs.CV·August 7, 2025

TL;DR

PROM is a quantization method for depthwise-separable CNNs that uses ternary weights for pointwise convolutions and 8-bit weights for the remaining modules, reducing energy cost by 23.9x and storage size by 2.7x on MobileNetV2 while maintaining similar ImageNet accuracy.

Contribution

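PROM is a simple quantization scheme that matches bit-width to computational cost: pointwise convolutions, the most expensive component of modern depthwise-separable CNNs, are quantized to ternary weights, while all other modules use 8-bit weights.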

Findings

01. Reduces energy cost by 23.9x on MobileNetV2

02. Decreases storage size by 2.7x compared to float16

03. Maintains similar classification accuracy on ImageNet

Abstract

Convolutional neural networks (CNNs) are crucial for computer vision tasks on resource-constrained devices. Quantization effectively compresses these models, reducing storage size and energy cost. However, in modern depthwise-separable architectures, the computational cost is distributed unevenly across its components, with pointwise operations being the most expensive. By applying a general quantization scheme to this imbalanced cost distribution, existing quantization approaches fail to fully exploit potential efficiency gains. To this end, we introduce PROM, a straightforward approach for quantizing modern depthwise-separable convolutional networks by selectively using two distinct bit-widths. Specifically, pointwise convolutions are quantized to ternary weights, while the remaining modules use 8-bit weights, which is achieved through a simple quantization-aware training procedure.…
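To make the two bit-widths concrete, the following is a minimal sketch of how a pointwise weight tensor could be mapped to ternary values and a depthwise weight tensor to 8-bit integers. It is not the authors' implementation: the threshold heuristic (`threshold_factor=0.7`), the per-tensor symmetric 8-bit scaling, the function names, and the tensor shapes are all assumptions chosen for illustration, and the quantization-aware training procedure itself is not shown.

```python
import numpy as np


def ternarize(w, threshold_factor=0.7):
    """Map weights to codes in {-1, 0, +1} plus one scale alpha.

    Assumption: a threshold-based heuristic (threshold proportional to the
    mean absolute weight). The paper's exact ternarization rule may differ.
    """
    delta = threshold_factor * np.mean(np.abs(w))
    codes = np.where(np.abs(w) > delta, np.sign(w), 0.0).astype(np.int8)
    nonzero = codes != 0
    # alpha is the mean magnitude of the weights that were kept
    alpha = float(np.abs(w[nonzero]).mean()) if nonzero.any() else 0.0
    return codes, alpha


def quantize_int8(w):
    """Symmetric per-tensor 8-bit quantization (an assumed, common scheme)."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


# Hypothetical layer shapes: (out_channels, in_channels, kH, kW)
rng = np.random.default_rng(0)
pointwise_w = rng.normal(size=(32, 16, 1, 1)).astype(np.float32)  # 1x1 conv
depthwise_w = rng.normal(size=(16, 1, 3, 3)).astype(np.float32)   # 3x3 depthwise

pw_codes, pw_alpha = ternarize(pointwise_w)     # ternary for pointwise
dw_q, dw_scale = quantize_int8(depthwise_w)     # 8-bit for the rest

# Dequantized approximations of the original weights
pointwise_approx = pw_codes.astype(np.float32) * pw_alpha
depthwise_approx = dw_q.astype(np.float32) * dw_scale
```

Because the pointwise codes only take the values -1, 0, and +1, a dot product with them reduces to additions and subtractions followed by a single multiplication by `pw_alpha`, which is the sense in which ternary weights remove multiplications from the most expensive layers.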

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Generative Adversarial Networks and Image Synthesis

Methods: Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Batch Normalization · Average Pooling · Inverted Residual Block · Convolution · 1x1 Convolution