HyTIP: Hybrid Temporal Information Propagation for Masked Conditional Residual Video Coding
Yi-Hsin Chen, Yi-Chen Yao, Kuan-Wei Ho, Chun-Hung Wu, Huu-Tai Phung, Martin Benjak, Jörn Ostermann, Wen-Hsiao Peng

TL;DR
HyTIP introduces a hybrid video coding framework combining output-recurrence and hidden-to-hidden mechanisms, achieving superior rate-distortion performance with smaller buffer sizes compared to existing methods.
Contribution
This work proposes HyTIP, a novel hybrid buffering strategy that integrates explicit decoded frames and latent features for improved video coding efficiency.
Findings
Outperforms individual recurrence approaches in coding performance.
Achieves results comparable to state-of-the-art methods while using smaller buffers.
Outperforms VTM 17.0 in PSNR-RGB and MS-SSIM-RGB metrics.
Abstract
Most frame-based learned video codecs can be interpreted as recurrent neural networks (RNNs) propagating reference information along the temporal dimension. This work revisits the limitations of the current approaches from an RNN perspective. The output-recurrence methods, which propagate decoded frames, are intuitive but impose dual constraints on the output decoded frames, leading to suboptimal rate-distortion performance. In contrast, the hidden-to-hidden connection approaches, which propagate latent features within the RNN, offer greater flexibility but require large buffer sizes. To address these issues, we propose HyTIP, a learned video coding framework that combines both mechanisms. Our hybrid buffering strategy uses explicit decoded frames and a small number of implicit latent features to achieve competitive coding performance. Experimental results show that our HyTIP…
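The buffer-size trade-off between the three propagation schemes can be illustrated with a rough count of the values each one carries from frame to frame. The sketch below is not the authors' implementation: the `BufferSpec` class, the `compare` helper, and all channel counts (3 frame channels, 48 latent channels for hidden-to-hidden, 2 latent channels for the hybrid case) are hypothetical and chosen only to show how the sizes scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferSpec:
    """Reference buffer carried between frames, counted in stored values (not bytes)."""
    frame_channels: int    # channels of the explicit decoded frame (0 if none is kept)
    feature_channels: int  # channels of the implicit latent features (0 if none are kept)

    def size(self, height: int, width: int) -> int:
        # Assumes features are stored at full frame resolution (an illustrative choice).
        return (self.frame_channels + self.feature_channels) * height * width


# Hypothetical configurations; the channel counts are assumptions, not figures from the paper.
SCHEMES = {
    "output_recurrence": BufferSpec(frame_channels=3, feature_channels=0),
    "hidden_to_hidden": BufferSpec(frame_channels=0, feature_channels=48),
    "hybrid": BufferSpec(frame_channels=3, feature_channels=2),
}


def compare(height: int = 1080, width: int = 1920) -> dict[str, int]:
    """Return the number of buffered values per scheme for one frame size."""
    return {name: spec.size(height, width) for name, spec in SCHEMES.items()}


if __name__ == "__main__":
    for name, values in compare().items():
        print(f"{name:>18}: {values:,} values")
```

Under these assumed channel counts, the hybrid buffer is only slightly larger than a decoded frame alone and much smaller than a purely latent buffer, which is the trade-off the abstract describes qualitatively.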
Taxonomy
Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis
