Splatter a Video: Video Gaussian Representation for Versatile Processing

Yang-Tian Sun; Yi-Hua Huang; Lin Ma; Xiaoyang Lyu; Yan-Pei Cao; Xiaojuan Qi

arXiv:2406.13870·cs.CV·June 27, 2024

TL;DR

This paper introduces a novel explicit 3D video representation using Gaussian proxies, enabling versatile video processing tasks like editing, depth estimation, and stereoscopic generation with improved modeling of complex motions.

Contribution

The paper proposes a video Gaussian representation that models a video in 3D space with explicit Gaussians. It distills 2D priors to regularize learning, which makes the video easier to manipulate and process.

Findings

01. Effective in tracking, depth refinement, and editing.

02. Versatile across multiple video processing tasks.

03. Outperforms existing implicit 3D representations.

Abstract

Video representation is a long-standing problem that is crucial for various downstream tasks, such as tracking, depth prediction, segmentation, view synthesis, and editing. However, current methods either struggle to model complex motions due to the absence of 3D structure or rely on implicit 3D representations that are ill-suited for manipulation tasks. To address these challenges, we introduce a novel explicit 3D representation, the video Gaussian representation, that embeds a video into 3D Gaussians. Our proposed representation models video appearance in a 3D canonical space using explicit Gaussians as proxies and associates each Gaussian with 3D motions for video motion. This approach offers a more intrinsic and explicit representation than layered atlas or volumetric pixel matrices. To obtain such a representation, we distill 2D priors, such as optical flow and depth, from foundation…
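To make the idea of "explicit Gaussians in a canonical space, each with its own 3D motion" more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class `VideoGaussian`, the polynomial motion model, and the orthographic projection in `project_orthographic` are all hypothetical choices made for this example, since the abstract does not specify how motion or the camera is parameterized.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class VideoGaussian:
    """One explicit Gaussian proxy (hypothetical layout for illustration)."""
    center: Vec3             # position in the canonical 3D space
    color: Vec3              # RGB appearance carried by this Gaussian
    poly_coeffs: List[Vec3]  # ASSUMPTION: polynomial motion, one Vec3 per degree 1, 2, ...

    def position_at(self, t: float) -> Vec3:
        """Canonical center plus a polynomial displacement at normalized time t."""
        x, y, z = self.center
        for degree, (cx, cy, cz) in enumerate(self.poly_coeffs, start=1):
            w = t ** degree
            x, y, z = x + cx * w, y + cy * w, z + cz * w
        return (x, y, z)


def project_orthographic(p: Vec3) -> Vec3:
    """ASSUMPTION: orthographic camera, so (x, y) are image coordinates and z is depth."""
    return (p[0], p[1], p[2])


def track(g: VideoGaussian, times: List[float]) -> List[Vec3]:
    """Follow one Gaussian over time; each entry is (u, v, depth)."""
    return [project_orthographic(g.position_at(t)) for t in times]


# A single Gaussian that drifts right linearly and down quadratically.
example = VideoGaussian(
    center=(1.0, 2.0, 3.0),
    color=(0.8, 0.1, 0.1),
    poly_coeffs=[(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
)
trajectory = track(example, [0.0, 0.5, 1.0])
```

Because every Gaussian is an explicit object, the same structure can serve several of the tasks listed in the abstract: the `(u, v)` part of `trajectory` is a point track, the third component is a per-point depth, and changing `color` would propagate an edit to every frame in which that Gaussian is visible.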

Videos

Splatter a Video: Video Gaussian Representation for Versatile Processing · SlidesLive

Taxonomy

Topics: Advanced Data Compression Techniques · Image and Signal Denoising Methods · Chaos-based Image/Signal Encryption