Effortless Cross-Platform Video Codec: A Codebook-Based Method

Kuan Tian; Yonghang Guan; Jinxi Xiang; Jun Zhang; Xiao Han; and Wei Yang

arXiv:2310.10292 · cs.CV · October 17, 2023 · 1 citation

Open Access · 3 Reviews

TL;DR

This paper introduces a cross-platform video codec that uses codebooks and a cross-attention mechanism, eliminating entropy models and ensuring consistent decoding across different hardware platforms, with competitive compression performance.

Contribution

The proposed framework removes the need for entropy models and optical flow, enabling efficient, cross-platform video compression with consistent decoding and improved performance over traditional codecs.
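To make the codebook idea concrete, the following is a minimal sketch of plain vector quantization, not the paper's multi-stage codebook or its cross-attention design. The names `nearest_index`, `encode`, and `decode`, along with the toy codebook and latents, are illustrative assumptions. The point it shows is that the encoder emits integer indices and the decoder only performs a table lookup, so no probability estimate has to agree between the two sides.

```python
def nearest_index(vector, codebook):
    """Return the index of the codeword with the smallest squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def encode(latents, codebook):
    """Replace each latent vector with the index of its nearest codeword."""
    return [nearest_index(v, codebook) for v in latents]


def decode(indices, codebook):
    """Decoding is a pure table lookup on integer indices."""
    return [codebook[i] for i in indices]


# Hypothetical 2-D codebook and latents, chosen only for illustration.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latents = [(0.9, 0.1), (0.2, 0.8), (0.1, 0.1)]

indices = encode(latents, codebook)         # integer indices form the bitstream
reconstruction = decode(indices, codebook)  # same result on any platform sharing the codebook
print(indices, reconstruction)
```

Because the decoder in this sketch never recomputes a distribution, small floating-point differences between machines cannot change what it reconstructs, provided both sides hold the same codebook.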

Findings

1. Outperforms H.265 (medium) in compression quality.

2. Eliminates entropy model inconsistencies across platforms.

3. Achieves high efficiency without optical flow or autoregressive models.

Abstract

Under certain circumstances, advanced neural video codecs can surpass the most complex traditional codecs in their rate-distortion (RD) performance. One of the main reasons for the high performance of existing neural video codecs is the use of the entropy model, which can provide more accurate probability distribution estimations for compressing the latents. This also implies the rigorous requirement that entropy models running on different platforms should use consistent distribution estimations. However, in cross-platform scenarios, entropy models running on different platforms usually yield inconsistent probability distribution estimations due to floating point computation errors that are platform-dependent, which can cause the decoding side to fail in correctly decoding the compressed bitstream sent by the encoding side. In this paper, we propose a cross-platform video compression…
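The failure mode described in the abstract can be illustrated with a toy example. This is a sketch under stated assumptions, not the paper's experiment: `decode_symbol` is a hypothetical two-symbol decoder, and the two summation orders stand in for two platforms that evaluate the same entropy model with differently ordered floating-point operations.

```python
def decode_symbol(code_value, boundary):
    """Toy two-symbol decoder: symbol 0 if the code value lies below the boundary."""
    return 0 if code_value < boundary else 1


# Floating-point addition is not associative, so the same three probabilities
# summed in a different order give slightly different cumulative boundaries.
encoder_boundary = 0.1 + (0.2 + 0.3)   # order used on the hypothetical encoder
decoder_boundary = (0.1 + 0.2) + 0.3   # order used on the hypothetical decoder

code_value = 0.6  # a code value that happens to sit right at the boundary

encoder_symbol = decode_symbol(code_value, encoder_boundary)
decoder_symbol = decode_symbol(code_value, decoder_boundary)

print(encoder_boundary == decoder_boundary)  # False
print(encoder_symbol, decoder_symbol)        # the two sides disagree
```

In a real arithmetic coder a single disagreement of this kind desynchronizes every subsequent symbol, which is why the mismatch matters even though the numeric difference is tiny.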

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

- The paper is well-written and flows smoothly.
- The motivation for the proposed method seems intriguing in the context of neural video compression.

Weaknesses

1. **More RD-performance comparison with existing neural video compression methods**: The paper primarily made a comparison with traditional codecs like H.264 and H.265. While Figure 1 demonstrates the artifacts induced by entropy models in cross-platform settings, the paper does not conclusively establish if this issue is prevalent across all neural video compression methods. To strengthen the paper's claims, a more comprehensive RD-performance comparison with a variety of neural methods in sim

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 4

Strengths

1. The novelty of the paper is good.
2. Encouraging results.

Weaknesses

1. Related work should be updated.

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

1. This paper focuses on an important and practical issue in neural video compression and proposes a reasonable solution.
2. The design of the multi-stage codebook and window-based cross-attention successfully replaces the common motion compensation and autoregressive modules.
3. The performance of the proposed method is acceptable: it exceeds that of two common traditional codecs, H.265 and H.264.

Weaknesses

1. The novelty of this paper may be limited as the overall framework bears resemblance to Mentzer et al.'s (2022) approach, which avoids explicit motion estimation by employing a transform-like architecture. Additionally, the use of vector quantization for compression is not new and may have been inspired by Zhu's work in image compression. Furthermore, the window-based cross-attention appears similar to Swin Transformer. Consequently, the major architecture and designs lack sufficient novelty.

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Advanced Vision and Imaging

Methods: Concatenated Skip Connection · Softmax