Learned Image Compression with Generalized Octave Convolution and Cross-Resolution Parameter Estimation
Haisheng Fu, Feng Liang

TL;DR
This paper introduces a multi-resolution learned image compression framework using octave convolutions and cross-resolution parameter estimation, achieving faster decoding and improved rate-distortion performance over existing methods.
Contribution
It proposes a novel multi-resolution compression scheme that removes the need for context-adaptive models, significantly speeds up decoding, and enhances R-D performance with cross-resolution parameter estimation.
Findings
Decodes approximately 73-93% faster than state-of-the-art methods.
Achieves better R-D performance than H.266/VVC and some learning-based methods.
Maintains high image quality across various bit rates.
Abstract
Context-adaptive entropy models, which jointly use hyperpriors and autoregressive models to capture the spatial redundancy of the latent representations, significantly improve rate-distortion (R-D) performance. However, the latent representations still contain some spatial correlations. In addition, methods based on the context-adaptive entropy model cannot be accelerated in the decoding process by parallel computing devices, e.g., FPGAs or GPUs. To alleviate these limitations, we propose a learned multi-resolution image compression framework that exploits the recently developed octave convolutions to factorize the latent representations into high-resolution (HR) and low-resolution (LR) parts, similar to a wavelet transform. This factorization further improves the R-D performance. To speed up the decoding, our scheme does not use…
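The HR/LR factorization mentioned in the abstract can be illustrated with a small sketch. This is not the paper's network: it assumes fixed 2x2 average pooling and nearest-neighbour upsampling as stand-ins for the resampling steps inside an octave convolution, and the names `split_octave`, `avg_pool_2x`, `upsample_2x`, `count_values`, and the ratio `alpha` are hypothetical, chosen only for illustration.

```python
def avg_pool_2x(plane):
    """Halve resolution by averaging each non-overlapping 2x2 block.
    Assumes the plane has even height and width."""
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def upsample_2x(plane):
    """Double resolution by repeating each value in a 2x2 block."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def split_octave(channels, alpha):
    """Keep the first channels at full resolution (HR) and store the
    remaining fraction `alpha` of channels at half resolution (LR)."""
    n_lr = int(round(alpha * len(channels)))
    n_hr = len(channels) - n_lr
    hr = [[row[:] for row in plane] for plane in channels[:n_hr]]
    lr = [avg_pool_2x(plane) for plane in channels[n_hr:]]
    return hr, lr


def count_values(group):
    """Total number of scalar values stored in a group of planes."""
    return sum(len(row) for plane in group for row in plane)


# Toy latent (assumed shape): 4 channels, each 4x4.
latent = [
    [[float(c * 16 + i * 4 + j) for j in range(4)] for i in range(4)]
    for c in range(4)
]
hr, lr = split_octave(latent, alpha=0.5)
print(count_values(hr), count_values(lr))
```

In this toy setting, each LR plane holds a quarter of the spatial positions of an HR plane, which is where the reduction in spatial redundancy comes from. `upsample_2x` shows how an LR plane could be brought back to HR size when information is exchanged between the two resolutions.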
Taxonomy
Topics: Advanced Data Compression Techniques · Image and Signal Denoising Methods · Advanced Image Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
