PS-NeRV: Patch-wise Stylized Neural Representations for Videos
Yunpeng Bai, Chao Dong, Cairong Wang

TL;DR
PS-NeRV introduces a patch-wise implicit neural representation for videos that improves reconstruction quality and decoding speed, leveraging patch-based encoding with CNNs, MLPs, and AdaIN for enhanced high-frequency detail fitting.
Contribution
The paper proposes a novel patch-wise video representation method, PS-NeRV, combining CNNs, MLPs, and AdaIN to outperform pixel-wise and image-wise approaches.
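To make the pixel-wise / image-wise / patch-wise distinction concrete, here is a minimal sketch of what "patch-wise" means at the data level: each frame is cut into a regular grid, and every patch is addressed by a coordinate. The helper names, the nested-list frame layout, and the normalized `(t, i, j)` coordinate are illustrative assumptions, not details taken from the paper.

```python
def split_into_patches(frame, ph, pw):
    """Split a 2D frame (list of rows) into a dict {(i, j): patch}.

    Hypothetical helper: assumes the frame size is divisible by the patch size.
    """
    h, w = len(frame), len(frame[0])
    if h % ph or w % pw:
        raise ValueError("frame size must be divisible by patch size")
    patches = {}
    for i in range(h // ph):
        for j in range(w // pw):
            rows = frame[i * ph:(i + 1) * ph]
            patches[(i, j)] = [row[j * pw:(j + 1) * pw] for row in rows]
    return patches


def merge_patches(patches, grid_rows, grid_cols):
    """Reassemble a frame from the patch grid produced above."""
    frame = []
    for i in range(grid_rows):
        patch_height = len(patches[(i, 0)])
        for r in range(patch_height):
            line = []
            for j in range(grid_cols):
                line.extend(patches[(i, j)][r])
            frame.append(line)
    return frame


def patch_coordinates(t, num_frames, grid_rows, grid_cols):
    """Normalized (t, i, j) coordinates in [0, 1) for every patch of frame t.

    Assumption: the coordinate layout is chosen here for illustration only.
    """
    return [
        (t / num_frames, i / grid_rows, j / grid_cols)
        for i in range(grid_rows)
        for j in range(grid_cols)
    ]
```

In a patch-wise representation, a network would be queried once per coordinate returned by `patch_coordinates` and the outputs stitched together as in `merge_patches`. A pixel-wise model would instead need one query per pixel, and an image-wise model one query per frame.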
Findings
Achieves high-quality video reconstruction with fast decoding.
Effective in video compression and inpainting tasks.
Outperforms existing implicit neural representation methods.
Abstract
We study how to represent a video with implicit neural representations (INRs). Classical INR methods generally use MLPs to map input coordinates to output pixels, while some recent works have tried to reconstruct the whole image directly with CNNs. However, we argue that neither the pixel-wise nor the image-wise strategy is well suited to video data. Instead, we propose a patch-wise solution, PS-NeRV, which represents videos as a function of patches and the corresponding patch coordinate. It naturally inherits the advantages of image-wise methods, and achieves excellent reconstruction performance with fast decoding speed. The whole method includes conventional modules, like positional embedding, MLPs and CNNs, and also introduces AdaIN to enhance intermediate features. These simple yet essential changes help the network fit high-frequency details more easily. Extensive…
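Two of the modules named in the abstract, positional embedding and AdaIN, can be sketched in a few lines. This is a rough illustration under stated assumptions rather than the authors' implementation: the sinusoidal form, the `base` and `levels` values, and the idea that the AdaIN scale and shift would be predicted from the embedded coordinate are all assumptions made here for the example.

```python
import math


def positional_embedding(x, base=1.25, levels=4):
    """Sinusoidal embedding of a scalar coordinate x in [0, 1].

    Returns [sin(b^0*pi*x), cos(b^0*pi*x), ..., sin(b^(L-1)*pi*x), cos(...)].
    `base` and `levels` are illustrative defaults, not values from the paper.
    """
    out = []
    for k in range(levels):
        angle = (base ** k) * math.pi * x
        out.extend([math.sin(angle), math.cos(angle)])
    return out


def adain(features, gammas, betas, eps=1e-5):
    """Adaptive instance normalization on a list of 2D channels.

    Each channel is normalized to zero mean and unit variance over its
    spatial positions, then scaled by gamma and shifted by beta. In a full
    model the (gamma, beta) pairs would come from a small network; here
    they are passed in directly.
    """
    out = []
    for channel, gamma, beta in zip(features, gammas, betas):
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)
        out.append([[gamma * (v - mean) / std + beta for v in row]
                    for row in channel])
    return out
```

After `adain`, each channel has mean approximately `beta` and standard deviation approximately `gamma`, so the statistics of intermediate features can be steered per coordinate without changing the convolution weights.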
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis
