TL;DR
SNeRV introduces a spectral decomposition approach using wavelets to improve neural video representations by better capturing fine details and motion, addressing spectral bias in existing methods.
Contribution
The paper proposes SNeRV, a spectra-preserving neural video representation that decomposes video into low- and high-frequency components with a 2D discrete wavelet transform and extends the decomposition to the temporal domain for improved video reconstruction.
Findings
Outperforms existing NeRV models in detail preservation
Effectively captures fine textures and motion patterns
Enhances implicit video representations with spectral decomposition
Abstract
Neural representation for video (NeRV), which employs a neural network to parameterize video signals, introduces a novel methodology in video representations. However, existing NeRV-based methods have difficulty in capturing fine spatial details and motion patterns due to spectral bias, in which a neural network learns high-frequency (HF) components at a slower rate than low-frequency (LF) components. In this paper, we propose spectra-preserving NeRV (SNeRV) as a novel approach to enhance implicit video representations by efficiently handling various frequency components. SNeRV uses 2D discrete wavelet transform (DWT) to decompose video into LF and HF features, preserving spatial structures and directly addressing the spectral bias issue. To balance the compactness, we encode only the LF components, while HF components that include fine textures are generated by a decoder. Specialized…
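To make the LF/HF split described in the abstract concrete, the following is a minimal sketch of a single-level 2D DWT applied to one frame. It is not the authors' implementation: the Haar wavelet, the function names haar_dwt2 and haar_idwt2, and the use of NumPy are assumptions chosen for illustration, since the abstract does not say which wavelet SNeRV uses. The sketch only shows that a frame splits into one half-resolution LF band plus three HF detail bands, and that the frame is exactly recoverable from them.

```python
import numpy as np


def haar_dwt2(frame):
    """Single-level 2D Haar DWT of a 2D array with even height and width.

    Returns (lf, (h1, h2, h3)): one low-frequency band and three
    high-frequency detail bands, each at half the input resolution.
    The Haar wavelet is an illustrative assumption, not taken from the paper.
    """
    x = np.asarray(frame, dtype=np.float64)
    height, width = x.shape
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    lf = (a + b + c + d) / 2.0  # scaled block average: coarse structure
    h1 = (a + b - c - d) / 2.0  # top row minus bottom row
    h2 = (a - b + c - d) / 2.0  # left column minus right column
    h3 = (a - b - c + d) / 2.0  # diagonal difference
    return lf, (h1, h2, h3)


def haar_idwt2(lf, highs):
    """Invert haar_dwt2 and return the full-resolution frame."""
    h1, h2, h3 = highs
    height, width = lf.shape
    x = np.empty((2 * height, 2 * width), dtype=np.float64)
    x[0::2, 0::2] = (lf + h1 + h2 + h3) / 2.0
    x[0::2, 1::2] = (lf + h1 - h2 - h3) / 2.0
    x[1::2, 0::2] = (lf - h1 + h2 - h3) / 2.0
    x[1::2, 1::2] = (lf - h1 - h2 + h3) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((8, 8))  # stand-in for one grayscale video frame
    lf, highs = haar_dwt2(frame)
    print("LF band shape:", lf.shape)
    print("HF band shapes:", [band.shape for band in highs])
    print("perfect reconstruction:", np.allclose(haar_idwt2(lf, highs), frame))
```

In this sketch the LF band holds a quarter of the frame's coefficients, which illustrates why encoding only the LF components is compact; in SNeRV the HF components are generated by a decoder rather than stored, a step this example does not attempt to model.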
