ResiComp: Loss-Resilient Image Compression via Dual-Functional Masked Visual Token Modeling

Sixian Wang, Jincheng Dai, Xiaoqi Qin, Ke Yang, Kai Niu, and Ping Zhang

arXiv:2502.10812·eess.IV·March 3, 2025

Open Access

TL;DR

ResiComp introduces a neural image compression framework that enhances error resilience against packet loss by integrating feature-domain concealment and masked visual token modeling, balancing compression efficiency and robustness.

Contribution

The paper presents a neural image codec that unifies entropy modeling and packet loss concealment in a single framework, inspired by large language models, to improve the error resilience of neural image codecs (NICs).

Findings

01. Significantly improves NIC robustness to packet loss.

02. Offers flexible modes to balance efficiency and resilience.

03. Demonstrates superior performance in experiments.

Abstract

Recent neural image codecs (NICs) achieve strong compression performance, but limited attention has been paid to their error resilience. As a result, these NICs tend to be sensitive to packet losses, which are prevalent in real-time communications. In this paper, we investigate how to improve the resilience of NICs against packet losses. We propose ResiComp, a pioneering neural image compression framework with feature-domain packet loss concealment (PLC). Motivated by the inherent consistency between generation and compression, we advocate merging the tasks of entropy modeling and PLC into a unified framework focused on latent space context modeling. To this end, we take inspiration from the impressive generative capabilities of large language models (LLMs), particularly the recent advances of masked visual token modeling (MVTM). During training,…
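The abstract's central idea, that one latent-space context model can serve both entropy modeling and packet loss concealment, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `toy_context_model` is a hypothetical stand-in for a learned masked token predictor (here just smoothed counts of visible neighbours), tokens are laid out as a 1-D list, and a lost token is marked with `None`.

```python
import math

def toy_context_model(tokens, position, vocab_size):
    """Hypothetical stand-in for a learned context model.

    Returns a probability distribution over the token at `position`,
    using only the visible (non-None) immediate neighbours.
    """
    counts = [1.0] * vocab_size  # add-one smoothing so no token has zero probability
    for offset in (-1, 1):
        j = position + offset
        if 0 <= j < len(tokens) and tokens[j] is not None:
            counts[tokens[j]] += 4.0  # arbitrary weight favouring neighbour values
    total = sum(counts)
    return [c / total for c in counts]

def estimated_bits(tokens, position, vocab_size):
    """Entropy-modeling use: ideal code length of the true token given its context."""
    context = list(tokens)
    true_token = context[position]
    context[position] = None  # hide the token being coded
    probs = toy_context_model(context, position, vocab_size)
    return -math.log2(probs[true_token])

def conceal(received, vocab_size):
    """Concealment use: replace each lost token (None) with the most probable one."""
    restored = list(received)
    for i, tok in enumerate(received):
        if tok is None:
            probs = toy_context_model(received, i, vocab_size)
            restored[i] = max(range(vocab_size), key=probs.__getitem__)
    return restored

if __name__ == "__main__":
    sent = [2, 2, 2, 0]
    print(estimated_bits(sent, 1, vocab_size=4))   # well-predicted token: cheap to code
    print(estimated_bits(sent, 3, vocab_size=4))   # surprising token: costs more bits
    print(conceal([2, None, 2, 0], vocab_size=4))  # lost token filled from context
```

The same predicted distribution is read two ways: its negative log-probability gives a bit cost at the encoder, and its most likely symbol fills in a token that never arrived at the decoder. A real codec would replace the neighbour counts with a trained network and couple the bit cost to an actual entropy coder.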

Taxonomy

Topics: Advanced Image Processing Techniques · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis

Methods: Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Layer · Attention Is All You Need · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax