Joint Multi-scale Gated Transformer and Prior-guided Convolutional Network for Learned Image Compression
Zhengxin Chen, Xiaohai He, Tingrong Zhang, Shuhua Xiong, and Chao Ren

TL;DR
This paper introduces MGTPCN, a novel learned image compression framework combining multi-scale gated transformers and prior-guided convolutions, achieving superior compression performance and efficiency over existing methods.
Contribution
The paper proposes a joint architecture that pairs a new prior-guided convolution (PGConv) with a multi-scale gated transformer to improve feature extraction in learned image compression.
Findings
Outperforms state-of-the-art algorithms in compression quality
Achieves better performance-complexity trade-off
Effectively extracts multi-scale and high-frequency features
Abstract
Recently, learned image compression methods have made remarkable achievements, some of which have outperformed the traditional image codec VVC. The advantages of learned image compression methods over traditional image codecs can be largely attributed to their powerful nonlinear transform coding. Convolutional layers and shifted window transformer (Swin-T) blocks are the basic units of neural networks, and their representation capabilities play an important role in nonlinear transform coding. In this paper, to improve the ability of the vanilla convolution to extract local features, we propose a novel prior-guided convolution (PGConv), where asymmetric convolutions (AConvs) and difference convolutions (DConvs) are introduced to strengthen skeleton elements and extract high-frequency information, respectively. A re-parameterization strategy is also used to reduce the computational…
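The re-parameterization idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' PGConv: the visible part of the abstract does not give kernel sizes, the exact form of the difference convolution, or how branches are combined. The sketch assumes a single channel, a 3x3 vanilla kernel, 1x3 and 3x1 asymmetric kernels, a central-difference branch, no bias or normalization, and branches that are simply summed. All function names are hypothetical. Under those assumptions, the four training-time branches collapse into one 3x3 kernel that gives the same output at inference.

```python
import random

def conv2d_valid(img, kernel):
    """Single-channel 2-D cross-correlation with 'valid' padding."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(kernel[a][b] * img[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

def multi_branch(img, k3, k_h, k_v, k_d):
    """Training-time form (assumed): four parallel branches, summed."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h - 2):
        row = []
        for j in range(w - 2):
            centre = img[i + 1][j + 1]
            # Vanilla 3x3 branch.
            vanilla = sum(k3[a][b] * img[i + a][j + b]
                          for a in range(3) for b in range(3))
            # Asymmetric branches: 1x3 on the centre row, 3x1 on the centre column.
            horiz = sum(k_h[b] * img[i + 1][j + b] for b in range(3))
            vert = sum(k_v[a] * img[i + a][j + 1] for a in range(3))
            # Central-difference branch: weights applied to (neighbour - centre).
            diff = sum(k_d[a][b] * (img[i + a][j + b] - centre)
                       for a in range(3) for b in range(3))
            row.append(vanilla + horiz + vert + diff)
        out.append(row)
    return out

def reparameterize(k3, k_h, k_v, k_d):
    """Fold all four branches into one equivalent 3x3 kernel."""
    merged = [[k3[a][b] + k_d[a][b] for b in range(3)] for a in range(3)]
    for b in range(3):
        merged[1][b] += k_h[b]          # 1x3 kernel sits on the centre row
    for a in range(3):
        merged[a][1] += k_v[a]          # 3x1 kernel sits on the centre column
    # sum_p w(p) * (x(p) - x_c) = sum_p w(p) * x(p) - x_c * sum_p w(p)
    merged[1][1] -= sum(sum(r) for r in k_d)
    return merged

rng = random.Random(0)
rand = lambda: rng.uniform(-1.0, 1.0)
img = [[rand() for _ in range(6)] for _ in range(5)]
k3 = [[rand() for _ in range(3)] for _ in range(3)]
k_d = [[rand() for _ in range(3)] for _ in range(3)]
k_h = [rand() for _ in range(3)]
k_v = [rand() for _ in range(3)]

slow = multi_branch(img, k3, k_h, k_v, k_d)
fast = conv2d_valid(img, reparameterize(k3, k_h, k_v, k_d))
max_err = max(abs(s - f) for rs, rf in zip(slow, fast) for s, f in zip(rs, rf))
print("branches match merged kernel:", max_err < 1e-9)
```

Because every branch here is linear in the input, the merge is exact up to floating-point rounding, so the extra branches would add no inference cost in this simplified setting.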
Taxonomy
Topics: Advanced Data Compression Techniques · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis
