Fast Lossless Neural Compression with Integer-Only Discrete Flows
Siyu Wang, Jianfei Chen, Chongxuan Li, Jun Zhu, Bo Zhang

TL;DR
This paper introduces Integer-only Discrete Flows (IODF), a lossless neural compression method built on integer-only arithmetic. It retains a high compression rate while running inference 10x faster on GPUs than the fastest existing neural compressors, which makes it better suited to practical deployment.
Contribution
The paper proposes IODF, an efficient neural compressor with integer-only arithmetic and learnable binary gates, enabling fast inference while maintaining high compression rates.
Findings
Achieves a 10x inference speedup over the fastest existing neural compressors when deployed with TensorRT on GPUs.
Retains high compression rates on ImageNet datasets.
Uses integer discrete flows with 8-bit quantization for invertible transformations.
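To illustrate what an invertible transformation between discrete variables can look like, here is a minimal sketch of an integer additive coupling step, the kind of building block generally associated with integer discrete flows. It is not the authors' implementation: the function `translation` is a hypothetical stand-in for a learned network, and the names `forward` and `inverse` are chosen for illustration only.

```python
import numpy as np

def translation(x_a):
    # Hypothetical stand-in for a learned network t(.).
    # Any deterministic function of x_a keeps the step invertible.
    return 0.37 * x_a + 1.5

def forward(x):
    # Split the integer input in two halves along the last axis.
    x_a, x_b = np.split(x, 2, axis=-1)
    # Shift the second half by a rounded (integer) translation of the first.
    z_b = x_b + np.rint(translation(x_a)).astype(np.int64)
    return np.concatenate([x_a, z_b], axis=-1)

def inverse(z):
    # The first half passes through unchanged, so the same shift
    # can be recomputed and subtracted exactly.
    z_a, z_b = np.split(z, 2, axis=-1)
    x_b = z_b - np.rint(translation(z_a)).astype(np.int64)
    return np.concatenate([z_a, x_b], axis=-1)

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(4, 8))  # e.g. 8-bit pixel values
z = forward(x)
x_rec = inverse(z)
```

Because the shift is rounded to an integer and depends only on the untouched half, the output stays integer-valued and the input is recovered exactly, which is the property lossless compression needs.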
Abstract
By applying entropy codecs with learned data distributions, neural compressors have significantly outperformed traditional codecs in terms of compression ratio. However, the high inference latency of neural networks hinders the deployment of neural compressors in practical applications. In this work, we propose Integer-only Discrete Flows (IODF), an efficient neural compressor with integer-only arithmetic. Our work is built upon integer discrete flows, which consist of invertible transformations between discrete random variables. We propose efficient invertible transformations with integer-only arithmetic based on 8-bit quantization. Our invertible transformation is equipped with learnable binary gates to remove redundant filters during inference. We deploy IODF with TensorRT on GPUs, achieving 10x inference speedup compared to the fastest existing neural compressors, while retaining…
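The two efficiency ideas in the abstract, 8-bit quantization and binary gates that remove filters at inference, can be sketched generically as follows. This is an assumption-laden illustration, not the paper's scheme: it uses plain symmetric quantization with a single scale, treats the gates as an already-learned 0/1 vector, and the helper names `quantize` and `int8_linear` are hypothetical.

```python
import numpy as np

def quantize(x, scale):
    # Symmetric 8-bit quantization (assumed scheme): real value ~ scale * q.
    return np.clip(np.rint(x / scale), -128, 127).astype(np.int8)

def int8_linear(q_x, q_w, gates):
    # Drop filters (rows of q_w) whose binary gate is 0, so they cost
    # nothing at inference time.
    kept = q_w[gates.astype(bool)]
    # Multiply int8 operands but accumulate in int32 to avoid overflow.
    return q_x.astype(np.int32) @ kept.astype(np.int32).T

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(4, 16))   # activations
w = rng.uniform(-1.0, 1.0, size=(8, 16))   # 8 filters
gates = np.array([1, 0, 1, 1, 0, 0, 1, 0]) # assumed learned gate values
scale = 1.0 / 127.0

acc = int8_linear(quantize(x, scale), quantize(w, scale), gates)
# Dequantize only to compare against the floating-point result.
approx = acc * scale * scale
exact = x @ w[gates.astype(bool)].T
```

Only integer operations appear inside `int8_linear`; the floating-point rescaling at the end is there purely to check the approximation and would be replaced by integer requantization in an integer-only pipeline.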
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Neural Networks and Applications
