ResiComp: Loss-Resilient Image Compression via Dual-Functional Masked Visual Token Modeling
Sixian Wang, Jincheng Dai, Xiaoqi Qin, Ke Yang, Kai Niu, and Ping Zhang

TL;DR
ResiComp introduces a neural image compression framework that enhances error resilience against packet loss by integrating feature-domain concealment and masked visual token modeling, balancing compression efficiency and robustness.
Contribution
The paper presents a neural image codec that unifies entropy modeling and packet loss concealment in a single framework, drawing on ideas from large language models, to improve the error resilience of neural image codecs (NICs).
Findings
Significantly improves NIC robustness to packet loss.
Offers flexible modes to balance efficiency and resilience.
Demonstrates superior performance in experiments.
Abstract
Recent advancements in neural image codecs (NICs) have delivered significant compression performance, but limited attention has been paid to their error resilience. The resulting NICs tend to be sensitive to packet losses, which are prevalent in real-time communications. In this paper, we investigate how to improve the resilience of NICs against packet losses. We propose ResiComp, a pioneering neural image compression framework with feature-domain packet loss concealment (PLC). Motivated by the inherent consistency between generation and compression, we advocate merging the tasks of entropy modeling and PLC into a unified framework focused on latent space context modeling. To this end, we take inspiration from the impressive generative capabilities of large language models (LLMs), particularly the recent advances in masked visual token modeling (MVTM). During training,…
Taxonomy
Topics: Advanced Image Processing Techniques · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis
Methods: Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Layer · Attention Is All You Need · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax
