TL;DR
This paper introduces a hierarchical 4D Gaussian representation for long volumetric videos that enables efficient, high-quality rendering with nearly constant GPU memory usage regardless of video length.
Contribution
The paper proposes the Temporal Gaussian Hierarchy, a multi-level 4D Gaussian model that adaptively captures scene dynamics and reduces memory footprint for long volumetric videos.
Findings
Reports lower training cost, faster rendering, and smaller storage than existing methods.
Handles minutes-long volumetric videos efficiently while maintaining rendering quality.
Keeps GPU memory usage nearly constant during both training and rendering, independent of video length.
Abstract
This paper aims to address the challenge of reconstructing long volumetric videos from multi-view RGB videos. Recent dynamic view synthesis methods leverage powerful 4D representations, like feature grids or point cloud sequences, to achieve high-quality rendering results. However, they are typically limited to short (1~2s) video clips and often suffer from large memory footprints when dealing with longer videos. To solve this issue, we propose a novel 4D representation, named Temporal Gaussian Hierarchy, to compactly model long volumetric videos. Our key observation is that there are generally various degrees of temporal redundancy in dynamic scenes, which consist of areas changing at different speeds. Motivated by this, our approach builds a multi-level hierarchy of 4D Gaussian primitives, where each level separately describes scene regions with different degrees of content change,…
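The idea of a multi-level temporal hierarchy can be illustrated with a small sketch. This is not the authors' implementation: the abstract above is truncated before it describes the layout, so the sketch assumes that each deeper level splits time into segments half as long as the level before, and that rendering a moment `t` needs only the one segment per level that contains `t`. The names `Segment`, `build_hierarchy`, `active_segments`, and `base_length` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    level: int    # 0 = coarsest level (assumed to hold the slowest-changing content)
    index: int    # position of the segment within its level
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds


def build_hierarchy(duration, num_levels, base_length):
    """Split [0, duration] into segments, halving the segment length per level.

    The halving rule is an assumption made for this sketch.
    """
    levels = []
    for level in range(num_levels):
        seg_len = base_length / (2 ** level)
        count = math.ceil(duration / seg_len)
        levels.append([
            Segment(level, i, i * seg_len, min((i + 1) * seg_len, duration))
            for i in range(count)
        ])
    return levels


def active_segments(levels, base_length, t):
    """Return the single segment per level whose time span contains t."""
    active = []
    for level, segs in enumerate(levels):
        seg_len = base_length / (2 ** level)
        # Clamp so that t == duration maps to the last segment.
        i = min(int(t // seg_len), len(segs) - 1)
        active.append(segs[i])
    return active


if __name__ == "__main__":
    short_clip = build_hierarchy(duration=10.0, num_levels=4, base_length=8.0)
    long_clip = build_hierarchy(duration=600.0, num_levels=4, base_length=8.0)
    print(sum(len(level) for level in short_clip), "segments in the short clip")
    print(sum(len(level) for level in long_clip), "segments in the long clip")
    print(len(active_segments(long_clip, 8.0, 431.7)), "segments active at t=431.7")
```

Under these assumptions, the total number of segments grows with the video length, but the number of segments needed at any one moment equals the number of levels. If each segment holds a bounded set of Gaussian primitives, the working set for a single frame stays roughly fixed, which is one way to read the nearly constant memory claim.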
