Towards End-to-End Image Compression and Analysis with Transformers

Yuanchao Bai; Xu Yang; Xianming Liu; Junjun Jiang; Yaowei Wang; Xiangyang Ji; Wen Gao

arXiv:2112.09300·cs.CV·December 20, 2021

TL;DR

This paper introduces an end-to-end image compression and analysis model using Transformers, which improves compression and classification by integrating compressed features with Transformer-based long-term information.

Contribution

It redesigns the Vision Transformer to operate directly on compressed features and introduces a feature aggregation module for enhanced compression and reconstruction.

Findings

01. Effective in both image compression and classification tasks
02. Improves compression performance by leveraging Transformer long-term information
03. Achieves competitive results with a novel two-step training strategy

Abstract

We propose an end-to-end image compression and analysis model with Transformers, targeting cloud-based image classification. Instead of placing an existing Transformer-based image classification model directly after an image codec, we aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and to facilitate image compression with the long-term information from the Transformer. Specifically, we first replace the patchify stem (i.e., image splitting and embedding) of the ViT model with a lightweight image encoder modelled by a convolutional neural network. The compressed features generated by the image encoder carry a convolutional inductive bias and are fed to the Transformer for image classification, bypassing image reconstruction. Meanwhile, we propose a feature aggregation module to fuse the compressed…
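To make the stem replacement concrete, the sketch below compares how many tokens a standard ViT patchify stem produces with how many a stack of strided convolutions produces. It only does shape arithmetic with the usual convolution output-size formula. The 224x224 input, the 16x16 patch size, and the encoder layout (four 5x5 convolutions with stride 2 and padding 2) are illustrative assumptions, not the configuration reported in the paper, and the function names are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def patchify_tokens(image_size: int, patch_size: int) -> int:
    """Token count of a patchify stem: non-overlapping patches,
    i.e. a convolution whose kernel and stride both equal the patch size."""
    per_side = conv_out(image_size, kernel=patch_size, stride=patch_size, padding=0)
    return per_side * per_side


def conv_encoder_tokens(image_size: int, num_layers: int,
                        kernel: int = 5, stride: int = 2, padding: int = 2) -> int:
    """Token count after a stack of strided convolutions.
    The layer settings are assumed for illustration, not taken from the paper."""
    size = image_size
    for _ in range(num_layers):
        size = conv_out(size, kernel, stride, padding)
    return size * size


# Assumed setup: 224x224 input, 16x16 patches, four stride-2 conv layers.
vit_tokens = patchify_tokens(224, 16)
cnn_tokens = conv_encoder_tokens(224, num_layers=4)
print(vit_tokens, cnn_tokens)
```

Under these assumptions both stems downsample by a factor of 16 per side, so the Transformer receives a sequence of the same length either way. What changes is how each token is computed: overlapping convolutions in place of a single linear projection per patch.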

Code & Models

Repositories

bychao100/towards-image-compression-and-analysis-with-transformers · PyTorch · Official

Videos

Towards End-to-End Image Compression and Analysis with Transformers · Underline

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Image Processing Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Layer Normalization · Absolute Position Encodings · Dropout · Label Smoothing