Joint Multi-scale Gated Transformer and Prior-guided Convolutional Network for Learned Image Compression

Zhengxin Chen, Xiaohai He, Tingrong Zhang, Shuhua Xiong, and Chao Ren

arXiv:2512.00744 · cs.CV · December 2, 2025

TL;DR

This paper introduces MGTPCN, a novel learned image compression framework combining multi-scale gated transformers and prior-guided convolutions, achieving superior compression performance and efficiency over existing methods.

Contribution

It proposes a joint architecture with a new prior-guided convolution and multi-scale gated transformer to enhance feature extraction in learned image compression.

Findings

1. Outperforms state-of-the-art algorithms in compression quality
2. Achieves a better performance-complexity trade-off
3. Effectively extracts multi-scale and high-frequency features

Abstract

Recently, learned image compression methods have made remarkable achievements, some of which have outperformed the traditional image codec VVC. The advantages of learned image compression methods over traditional image codecs can be largely attributed to their powerful nonlinear transform coding. Convolutional layers and shifted window transformer (Swin-T) blocks are the basic units of neural networks, and their representation capabilities play an important role in nonlinear transform coding. In this paper, to improve the ability of the vanilla convolution to extract local features, we propose a novel prior-guided convolution (PGConv), where asymmetric convolutions (AConvs) and difference convolutions (DConvs) are introduced to strengthen skeleton elements and extract high-frequency information, respectively. A re-parameterization strategy is also used to reduce the computational…
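The re-parameterization idea mentioned in the abstract can be illustrated with a small sketch. It assumes a single-channel 3x3 setting with four parallel branches: a vanilla 3x3 convolution, a 1x3 and a 3x1 asymmetric convolution, and a central difference convolution. The abstract does not specify the branch layout or which difference operators PGConv uses, so the branch set, the function names (`conv2d_same`, `multi_branch`, `fuse`), and the choice of central difference are illustrative assumptions, not the authors' implementation. The point is only that, because every branch is linear, their kernels can be summed into one 3x3 kernel that costs the same as a vanilla convolution at inference.

```python
import numpy as np

def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding ('same' size)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def multi_branch(x, k_sq, k_row, k_col, k_diff):
    """Training-time form (assumed layout): four parallel branches, summed."""
    y = conv2d_same(x, k_sq)        # vanilla 3x3 branch
    y += conv2d_same(x, k_row)      # 1x3 asymmetric branch
    y += conv2d_same(x, k_col)      # 3x1 asymmetric branch
    # Central difference branch: sum_n w_n * (x[p + n] - x[p])
    y += conv2d_same(x, k_diff) - k_diff.sum() * x
    return y

def fuse(k_sq, k_row, k_col, k_diff):
    """Inference-time form: merge all branches into one 3x3 kernel."""
    fused = k_sq.astype(float).copy()
    fused[1, :] += k_row[0]         # 1x3 kernel sits on the middle row
    fused[:, 1] += k_col[:, 0]      # 3x1 kernel sits on the middle column
    cd = k_diff.astype(float).copy()
    cd[1, 1] -= k_diff.sum()        # difference term folds into the center tap
    return fused + cd

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k_sq = rng.standard_normal((3, 3))
k_row = rng.standard_normal((1, 3))
k_col = rng.standard_normal((3, 1))
k_diff = rng.standard_normal((3, 3))

y_branches = multi_branch(x, k_sq, k_row, k_col, k_diff)
y_fused = conv2d_same(x, fuse(k_sq, k_row, k_col, k_diff))
print(np.allclose(y_branches, y_fused))
```

The asymmetric branches add extra weight to the central row and column of the fused kernel (the "skeleton" positions), while the difference branch contributes a kernel whose taps sum to zero, which responds to local intensity changes rather than flat regions.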

Taxonomy

Topics: Advanced Data Compression Techniques · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis