HyTIP: Hybrid Temporal Information Propagation for Masked Conditional Residual Video Coding

Yi-Hsin Chen; Yi-Chen Yao; Kuan-Wei Ho; Chun-Hung Wu; Huu-Tai Phung; Martin Benjak; Jörn Ostermann; Wen-Hsiao Peng

arXiv:2508.02072·eess.IV·August 26, 2025

TL;DR

HyTIP introduces a hybrid video coding framework combining output-recurrence and hidden-to-hidden mechanisms, achieving superior rate-distortion performance with smaller buffer sizes compared to existing methods.

Contribution

This work proposes HyTIP, a novel hybrid buffering strategy that integrates explicit decoded frames and latent features for improved video coding efficiency.

Findings

01. Outperforms individual recurrence approaches in coding performance.

02. Achieves comparable results to state-of-the-art with smaller buffers.

03. Outperforms VTM 17.0 in PSNR-RGB and MS-SSIM-RGB metrics.

Abstract

Most frame-based learned video codecs can be interpreted as recurrent neural networks (RNNs) propagating reference information along the temporal dimension. This work revisits the limitations of the current approaches from an RNN perspective. The output-recurrence methods, which propagate decoded frames, are intuitive but impose dual constraints on the output decoded frames, leading to suboptimal rate-distortion performance. In contrast, the hidden-to-hidden connection approaches, which propagate latent features within the RNN, offer greater flexibility but require large buffer sizes. To address these issues, we propose HyTIP, a learned video coding framework that combines both mechanisms. Our hybrid buffering strategy uses explicit decoded frames and a small number of implicit latent features to achieve competitive coding performance. Experimental results show that our HyTIP…
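The buffer-size trade-off described in the abstract can be made concrete with a rough sketch. The Python below is not the authors' implementation: the `TemporalBuffer` class, the `buffer_sizes` helper, and every channel count are hypothetical, chosen only to illustrate why propagating decoded frames is cheap, propagating many latent features is expensive, and a hybrid of a decoded frame plus a few features sits in between. It also assumes features are stored at full frame resolution, which may not match any particular codec.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TemporalBuffer:
    """What a codec carries from one coded frame to the next (hypothetical model)."""

    frame_channels: int    # channels of the explicit decoded frame (0 if none is kept)
    feature_channels: int  # channels of implicit latent features (0 if none are kept)

    def num_values(self, height: int, width: int) -> int:
        # Assumption: features are buffered at full frame resolution.
        return (self.frame_channels + self.feature_channels) * height * width


# Channel counts are illustrative assumptions, not values taken from the paper.
OUTPUT_RECURRENCE = TemporalBuffer(frame_channels=3, feature_channels=0)
HIDDEN_TO_HIDDEN = TemporalBuffer(frame_channels=0, feature_channels=48)
HYBRID = TemporalBuffer(frame_channels=3, feature_channels=2)


def buffer_sizes(height: int, width: int) -> Dict[str, int]:
    """Return the number of buffered values per strategy for one reference step."""
    return {
        "output_recurrence": OUTPUT_RECURRENCE.num_values(height, width),
        "hidden_to_hidden": HIDDEN_TO_HIDDEN.num_values(height, width),
        "hybrid": HYBRID.num_values(height, width),
    }


if __name__ == "__main__":
    for name, size in buffer_sizes(1080, 1920).items():
        print(f"{name}: {size} values")
```

Under these assumed channel counts, the hybrid buffer is only slightly larger than the output-recurrence buffer and far smaller than the hidden-to-hidden one, which is the qualitative relationship the abstract describes.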

Taxonomy

Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis