Transformer-based Image Compression

Ming Lu, Peiyao Guo, Huiqing Shi, Chuntong Cao, and Zhan Ma

arXiv:2111.06707 · eess.IV · November 15, 2021

TL;DR

This paper introduces a Transformer-based image compression method that leverages a VAE architecture with novel neural transformation units and attention modules, achieving competitive performance with fewer parameters.

Contribution

The paper presents a new Transformer-based image compression framework built from neural transformation units (NTUs) and a causal attention module (CAM), reducing model size while maintaining high compression quality.

Findings

01

Outperforms state-of-the-art CNN-based LIC methods

02

Requires up to 45% fewer model parameters

03

Achieves comparable results to VVC intra profile

Abstract

A Transformer-based Image Compression (TIC) approach is developed that reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders. Both the main and hyper encoders comprise a sequence of neural transformation units (NTUs) that analyse and aggregate important information for a more compact representation of the input image, while the decoders mirror the encoder-side operations to reconstruct the pixel-domain image from the compressed bitstream. Each NTU consists of a Swin Transformer Block (STB) and a convolutional layer (Conv) to embed both long-range and short-range information. Meanwhile, a causal attention module (CAM) is devised for adaptive context modeling of latent features, using both hyper and autoregressive priors. TIC rivals state-of-the-art approaches including deep convolutional neural networks…
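
The causality constraint behind the context model in the abstract can be illustrated with a small sketch. This is not the paper's attention module: the function name `causal_context`, the use of a hyper-prior feature as the query, the 1-D ordering of latents, and the absence of learned projections are all assumptions made for illustration. It shows only that the context for position i is built from already-decoded latents (positions before i) plus side information, and never from the element being coded.

```python
import math


def causal_context(latents, hyper):
    """Strictly causal attention over a 1-D sequence (illustrative only).

    latents[i] -- feature vector of latent element i (decoded in order)
    hyper[i]   -- side-information vector for position i (assumed to be
                  available before any latent is decoded)

    The context for position i is a softmax-weighted average of
    latents[0..i-1], with hyper[i] as the query. It never reads
    latents[i] or anything after it. Position 0 has no past, so its
    context is a zero vector.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    contexts = []
    for i, query in enumerate(hyper):
        past = latents[:i]  # only already-decoded elements
        if not past:
            contexts.append([0.0] * dim)
            continue
        # Scaled dot-product scores between the query and each past latent.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in past]
        # Numerically stable softmax weights.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        contexts.append([
            sum(w * value[d] for w, value in zip(weights, past)) / total
            for d in range(dim)
        ])
    return contexts


if __name__ == "__main__":
    latents = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    hyper = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
    for position, context in enumerate(causal_context(latents, hyper)):
        print(position, context)
```

Because each context depends only on earlier latents, a decoder could recompute the same context step by step while decoding, which is the property an autoregressive prior relies on.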

Taxonomy

Topics: Advanced Data Compression Techniques · Image and Signal Denoising Methods · Advanced Image Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Dropout · Softmax · Stochastic Depth · Residual Connection · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Swin Transformer