Multi-Modality Deep Network for Extreme Learned Image Compression

Xuhao Jiang; Weimin Tan; Tian Tan; Bo Yan; Liquan Shen

arXiv:2304.13583 · eess.IV · April 27, 2023 · 1 citation

TL;DR

This paper introduces a multimodal deep learning approach for image compression guided by text, significantly improving visual quality at extremely low bitrates by leveraging semantic information.

Contribution

It proposes a novel text-guided multimodal compression framework with specialized modules and loss functions, outperforming existing methods at lower bitrates.

Findings

01. Achieves visually pleasing results at extremely low bitrates

02. Outperforms state-of-the-art methods while operating at 2x to 4x lower bitrates

03. User studies confirm superior visual quality

Abstract

Image-based single-modality learned compression approaches have demonstrated exceptionally powerful encoding and decoding capabilities in the past few years, but they suffer from blur and severe semantic loss at extremely low bitrates. To address this issue, we propose a multimodal machine learning method for text-guided image compression, in which the semantic information of text is used as prior information to guide image compression and improve compression performance. We study the role of the text description in different components of the codec and demonstrate its effectiveness. In addition, we adopt an image-text attention module and an image-request complement module to better fuse image and text features, and propose an improved multimodal semantic-consistent loss to produce semantically complete reconstructions. Extensive experiments, including a user study, prove that our method…
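To make the idea of fusing image and text features more concrete, here is a minimal sketch of scaled dot-product cross-attention, in which image feature vectors act as queries and text feature vectors act as keys and values. This is an illustration only, not the paper's image-text attention module: the function name `image_text_attention`, the residual fusion, the absence of learned projections, and the toy feature values are all assumptions made for brevity.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def image_text_attention(image_feats, text_feats):
    """Hypothetical helper: fuse text into image features via cross-attention.

    image_feats: list of N image feature vectors (queries), each of length d.
    text_feats:  list of M text feature vectors (keys and values), length d.
    Returns N fused vectors: each image vector plus its attended text summary.
    """
    d = len(image_feats[0])
    scale = 1.0 / math.sqrt(d)
    fused = []
    for q in image_feats:
        # Similarity of this image location to every text token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in text_feats]
        weights = softmax(scores)
        # Attention-weighted sum of the text vectors.
        attended = [
            sum(w * v[j] for w, v in zip(weights, text_feats)) for j in range(d)
        ]
        # Residual connection (an assumption) keeps the original image information.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


# Toy features, invented for illustration: 2 image locations, 3 text tokens, d = 2.
image_feats = [[1.0, 0.0], [0.0, 1.0]]
text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = image_text_attention(image_feats, text_feats)
```

Each image location receives a text summary weighted toward the tokens it is most similar to. A real codec would typically apply learned query, key, and value projections and operate on feature maps rather than short lists.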

Taxonomy

Topics: Advanced Data Compression Techniques · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques