TL;DR
This paper introduces ST-NeRF, a layered neural representation for generating editable, photo-realistic free-viewpoint videos of large dynamic scenes from only 16 cameras.
Contribution
It presents the first layered neural approach for editable free-viewpoint video of large scenes, incorporating scene parsing, continuous deformation, and object-aware rendering for enhanced editing capabilities.
Findings
Achieves high-quality, photo-realistic free-viewpoint videos.
Supports scene editing such as scaling, relocating, duplicating, and retiming.
Demonstrates effectiveness on large dynamic scenes with multiple performers.
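The editing operations listed above can be pictured as coordinate remappings applied before a layer is queried. The following is a minimal sketch under stated assumptions, not the paper's implementation: `LayerEdit`, `to_layer_frame`, `query_edited`, `scene_density`, and `toy_layer` are all hypothetical names, and the toy layer is a hand-written density function standing in for a trained neural layer. It assumes each layer can be queried at a 3D point and a time, so that scaling and relocating become an inverse affine map on the point, retiming becomes a shift on the time, and duplicating means querying the same layer under two different edits.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float, float]
# Hypothetical stand-in for one trained layer: (point, time) -> density.
LayerFn = Callable[[Point, float], float]


@dataclass(frozen=True)
class LayerEdit:
    """Illustrative edit parameters for one layer (names are assumptions)."""
    scale: float = 1.0                    # uniform scale about `center`, must be > 0
    offset: Point = (0.0, 0.0, 0.0)       # relocation in world space
    time_shift: float = 0.0               # retiming: added to the query time
    center: Point = (0.0, 0.0, 0.0)       # pivot used for scaling


def to_layer_frame(edit: LayerEdit, p: Point, t: float) -> Tuple[Point, float]:
    """Map a world-space query back into the layer's unedited frame.

    The forward edit is world = center + scale * (q - center) + offset,
    so the query point is pushed through the inverse of that map.
    """
    q = tuple(
        (pi - oi - ci) / edit.scale + ci
        for pi, oi, ci in zip(p, edit.offset, edit.center)
    )
    return q, t + edit.time_shift


def query_edited(layer: LayerFn, edit: LayerEdit, p: Point, t: float) -> float:
    """Query a layer as it appears after the edit has been applied."""
    q, tq = to_layer_frame(edit, p, t)
    return layer(q, tq)


def scene_density(layer: LayerFn, edits: Sequence[LayerEdit], p: Point, t: float) -> float:
    """Duplicate a layer by summing its density under several edits."""
    return sum(query_edited(layer, e, p, t) for e in edits)


def toy_layer(p: Point, t: float) -> float:
    """Toy 'performer': a unit ball whose center moves along x with time."""
    x, y, z = p
    return 1.0 if (x - t) ** 2 + y ** 2 + z ** 2 <= 1.0 else 0.0


if __name__ == "__main__":
    original = LayerEdit()
    moved_copy = LayerEdit(offset=(5.0, 0.0, 0.0))
    # The same layer is now present at the origin and at x = 5.
    print(scene_density(toy_layer, [original, moved_copy], (5.0, 0.0, 0.0), 0.0))
```

Mapping the query point instead of the layer leaves the underlying layer function unchanged, which is why one layer can be reused under several edits at once in this sketch.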
Abstract
Generating free-viewpoint videos is critical for immersive VR/AR experiences, but recent neural advances still lack the editing ability to manipulate the visual perception of large dynamic scenes. To fill this gap, we propose the first approach for editable, photo-realistic free-viewpoint video generation for large-scale dynamic scenes using only 16 sparse cameras. The core of our approach is a new layered neural representation, where each dynamic entity, including the environment itself, is formulated as a space-time coherent neural layered radiance representation called ST-NeRF. This layered representation supports full perception and realistic manipulation of the dynamic scene while still supporting a free viewing experience over a wide range. In our ST-NeRF, each dynamic entity/layer is represented as continuous functions, which achieve the disentanglement of location,…
