Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression

Hamidreza Soltani; Erfan Ghasemi

arXiv:2408.03842 · cs.CV · August 8, 2024

TL;DR

This paper introduces a novel Transformer-based image compression method that incorporates frequency-aware attention mechanisms and a mixed feed-forward network, significantly improving compression efficiency over existing learned image codecs.

Contribution

The paper proposes a hybrid spatial-channel attention Transformer with frequency-aware modules and a mixed local-global network, advancing the transformation stage in learned image compression.

Findings

1. Outperforms state-of-the-art LIC methods in rate-distortion metrics.

2. Enhances feature decorrelation through frequency-aware attention mechanisms.

3. Improves compression efficiency with a novel Transformer architecture.

Abstract

Recent advancements in learned image compression (LIC) methods have demonstrated superior performance over traditional hand-crafted codecs. These learning-based methods often employ convolutional neural networks (CNNs) or Transformer-based architectures. However, these nonlinear approaches frequently overlook the frequency characteristics of images, which limits their compression efficiency. To address this issue, we propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map. Our method integrates a novel Hybrid Spatial-Channel Attention Transformer Block (HSCATB), where a spatial-based branch independently handles high and low frequencies at the attention layer, and a Channel-aware Self-Attention (CaSA) module captures information across channels, significantly improving compression…
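
To make the idea of attention "across channels" concrete, here is a minimal NumPy sketch of a generic channel-wise self-attention step. It is not the paper's CaSA module, whose details are not given in this abstract. The function name `channel_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the single-head layout, and the `sqrt(n)` scaling are all assumptions made for illustration. The point it shows is that the attention map has shape (C, C), relating channels to channels, rather than (N, N) over spatial positions.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_self_attention(x, w_q, w_k, w_v):
    """Hypothetical single-head channel attention (illustrative only).

    x: (N, C) feature map flattened over N = H * W spatial positions.
    w_q, w_k, w_v: (C, C) projection matrices (assumed, not from the paper).
    Returns the mixed features (N, C) and the channel attention map (C, C).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # each (N, C)
    n = x.shape[0]
    # Similarity between channels: each row is a distribution over channels.
    attn = softmax(q.T @ k / np.sqrt(n), axis=-1)   # (C, C)
    # Output channel i is a weighted sum of the value channels j.
    out = v @ attn.T                            # (N, C)
    return out, attn


rng = np.random.default_rng(0)
h, w, c = 4, 4, 8                               # toy sizes, chosen arbitrarily
x = rng.standard_normal((h * w, c))
w_q, w_k, w_v = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
out, attn = channel_self_attention(x, w_q, w_k, w_v)
```

Because the attention map is C × C, its size depends on the number of channels rather than on the image resolution, which is one common motivation for pairing a channel branch with a spatial one.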

Taxonomy

Topics: Advanced Data Compression Techniques

Methods: Linear Layer · Residual Connection · Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections