Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates

Guojun Xu; Mingyang Zhang; Jianwen Xiang; Cheng Tan; Yanchao Yang; Junwei Zhou

arXiv:2605.22061·cs.CV·May 22, 2026

TL;DR

This paper introduces a multimodal distributed image compression framework that leverages side information through a diffusion-based decoder and feature masking to improve image quality at extremely low bitrates.

Contribution

The paper presents MDIC, which the authors describe as the first multimodal approach to distributed image compression. It draws on textual side information extracted from correlated images to improve reconstruction quality.

Findings

01 Achieves state-of-the-art perceptual quality at < 0.1 bpp.

02 Effectively preserves fine details and global semantics.

03 Outperforms existing methods on KITTI Stereo and Cityscapes datasets.
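
To put the < 0.1 bpp figure in concrete terms: bits per pixel (bpp) is the total number of coded bits divided by the number of pixels, and an uncompressed 24-bit RGB image sits at 24 bpp, so 0.1 bpp is roughly a 240x reduction. The sketch below is not taken from the paper. The helper names `bits_per_pixel` and `byte_budget` and the 1024 x 512 image size are assumptions chosen only for illustration.

```python
def bits_per_pixel(num_bytes: int, height: int, width: int) -> float:
    """Bitrate in bits per pixel for a coded image of height x width pixels."""
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    return 8 * num_bytes / (height * width)


def byte_budget(max_bpp: float, height: int, width: int) -> int:
    """Largest whole number of bytes whose bitrate does not exceed max_bpp."""
    return int(max_bpp * height * width // 8)


# Hypothetical image size, not a value reported by the paper.
height, width = 512, 1024

budget = byte_budget(0.1, height, width)
rate = bits_per_pixel(budget, height, width)

print(f"byte budget at 0.1 bpp: {budget}")
print(f"bitrate at that budget: {rate:.5f} bpp")
```

Under these assumptions the whole image has to fit in about 6.5 KB, which is why the decoder must lean so heavily on side information rather than on the transmitted bits alone.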

Abstract

Distributed Image Compression (DIC) is crucial for multi-view transmission, especially when operating at extremely low bitrates (< 0.1 bpp). Its core challenge is effectively utilizing side information to achieve high-quality reconstruction under strict bitrate budgets. However, existing DIC approaches struggle to exploit global context and object-level details from side information, leading to local blurring and the loss of fine details in the reconstruction. To address these limitations, we propose a Multimodal DIC framework (MDIC), which, for the first time, brings multimodal side information into the DIC paradigm, effectively preserving fine-grained local details and enhancing global perceptual quality in reconstructed images. Specifically, we introduce a text-to-image diffusion-based decoder conditioned on textual side information extracted from correlated images to…
