Boosting Neural Representations for Videos with a Conditional Decoder

Xinjie Zhang; Ren Yang; Dailan He; Xingtong Ge; Tongda Xu; Yan Wang; Hongwei Qin; Jun Zhang

arXiv:2402.18152 · eess.IV · March 19, 2024 · 2 cites

TL;DR

This paper presents a universal boosting framework for implicit neural video representations, improving their reconstruction quality, convergence speed, and codec performance through a conditional decoder and novel feature generation methods.

Contribution

The paper introduces a conditional decoder with a temporal-aware affine transform, together with sinusoidal NeRV-like blocks, to enhance implicit video representations, an approach not previously explored.
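To make the idea of a temporal-aware affine transform concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the class name `TemporalAffine`, the sinusoidal frame embedding, the per-channel normalization step, and the random (untrained) weights are all assumptions chosen for illustration. The only point it shows is that the frame index determines a per-channel scale and shift that are applied to intermediate features.

```python
import math
import random


def frame_embedding(t, num_freqs=4):
    """Sinusoidal embedding of a normalized frame index t in [0, 1] (assumed form)."""
    emb = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        emb.append(math.sin(freq * t))
        emb.append(math.cos(freq * t))
    return emb


class TemporalAffine:
    """Hypothetical module: maps a frame index to per-channel (scale, shift)."""

    def __init__(self, channels, num_freqs=4, seed=0):
        rng = random.Random(seed)
        dim = 2 * num_freqs
        self.num_freqs = num_freqs
        # Random stand-ins for weights that a real model would learn.
        self.w_scale = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(channels)]
        self.w_shift = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(channels)]

    def params(self, t):
        """Return the per-channel scale and shift conditioned on frame index t."""
        emb = frame_embedding(t, self.num_freqs)
        scale = [1.0 + sum(w * e for w, e in zip(row, emb)) for row in self.w_scale]
        shift = [sum(w * e for w, e in zip(row, emb)) for row in self.w_shift]
        return scale, shift

    def __call__(self, features, t):
        """features: list of channels, each a flat list of floats."""
        scale, shift = self.params(t)
        out = []
        for c, channel in enumerate(features):
            # Normalize each channel (an assumption), then apply the affine map.
            mean = sum(channel) / len(channel)
            var = sum((v - mean) ** 2 for v in channel) / len(channel)
            inv_std = 1.0 / math.sqrt(var + 1e-5)
            out.append([scale[c] * (v - mean) * inv_std + shift[c] for v in channel])
        return out


taa = TemporalAffine(channels=2)
feats = [[0.0, 1.0, 2.0, 3.0], [4.0, 4.5, 5.0, 5.5]]
out_a = taa(feats, t=0.0)  # same features, conditioned on the first frame
out_b = taa(feats, t=0.5)  # same features, conditioned on a middle frame
```

The same input features produce different outputs for different frame indices, which is the sense in which the decoder is conditional on the target frame.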

Findings

01. Boosts baseline INRs' reconstruction quality and convergence speed.

02. Achieves superior inpainting and interpolation results.

03. Outperforms baseline INRs and rivals traditional codecs in rate-distortion performance.

Abstract

Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing, showing remarkable versatility across various video tasks. However, existing methods often fail to fully leverage their representation capabilities, primarily due to inadequate alignment of intermediate features during target frame decoding. This paper introduces a universal boosting framework for current implicit video representation approaches. Specifically, we utilize a conditional decoder with a temporal-aware affine transform module, which uses the frame index as a prior condition to effectively align intermediate features with target frames. Besides, we introduce a sinusoidal NeRV-like block to generate diverse intermediate features and achieve a more balanced parameter distribution, thereby enhancing the model's capacity. With a high-frequency information-preserving…
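The abstract does not spell out the layout of the sinusoidal NeRV-like block, so the sketch below assumes it follows the familiar NeRV pattern of a convolution, a pixel-shuffle upsampling step, and an activation, with the activation replaced by a sine. To keep the example short and dependency-free, a 1x1 (pointwise) convolution stands in for the larger kernels a real model would likely use, and the weights are random. The helper names `conv1x1`, `pixel_shuffle`, and `sine_block` are hypothetical.

```python
import math
import random


def conv1x1(x, weight):
    """Pointwise channel mixing. x: [C_in][H][W]; weight: [C_out][C_in]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wc * x[ci][i][j] for ci, wc in enumerate(row))
              for j in range(w)] for i in range(h)] for row in weight]


def pixel_shuffle(x, r):
    """Rearrange [C*r*r][H][W] into [C][H*r][W*r], trading channels for resolution."""
    c_out = len(x) // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(h * r):
            for j in range(w * r):
                src = c * r * r + (i % r) * r + (j % r)
                out[c][i][j] = x[src][i // r][j // r]
    return out


def sine_block(x, weight, r):
    """Assumed block layout: conv -> pixel shuffle -> sine activation."""
    y = pixel_shuffle(conv1x1(x, weight), r)
    return [[[math.sin(v) for v in row] for row in ch] for ch in y]


rng = random.Random(0)
c_in, c_out, r = 3, 2, 2
# The convolution must emit c_out * r * r channels for the shuffle to consume.
weight = [[rng.gauss(0, 0.5) for _ in range(c_in)] for _ in range(c_out * r * r)]
x = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)] for _ in range(c_in)]
y = sine_block(x, weight, r)  # 3 x 4 x 4 input becomes 2 x 8 x 8 output
```

Because the sine is bounded and periodic, every output value lies in [-1, 1]. Whether this matches the block described in the paper should be checked against the official repository.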

Code & Models

Repositories

xinjie-q/boosting-nerv (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · ALIGN · Inpainting