GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting

Longan Wang; Yuang Shi; Wei Tsang Ooi

arXiv:2501.12060·cs.CV·January 23, 2025

TL;DR

GSVC represents and compresses videos with 2D Gaussian splats. It exploits temporal redundancy between adjacent frames and adapts the set of splats over time, reaching rate-distortion performance comparable to AV1 and VVC while rendering 1080p video at 1500 fps.

Contribution

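The paper presents GSVC, an approach that uses 2D Gaussian splats for video compression. It combines prediction of each frame's splats from the previous frame, pruning of splats that contribute little to quality, and random addition of splats to fit large motion or newly appeared objects.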

Findings

01 Achieves rate-distortion performance comparable to AV1 and VVC.

02 Attains rendering speeds of 1500 fps for 1080p videos.

03 Effectively captures scene dynamics and motion.

Abstract

3D Gaussian splats have emerged as a revolutionary, effective, learned representation for static 3D scenes. In this work, we explore using 2D Gaussian splats as a new primitive for representing videos. We propose GSVC, an approach to learning a set of 2D Gaussian splats that can effectively represent and compress video frames. GSVC incorporates the following techniques: (i) To exploit temporal redundancy among adjacent frames, which can speed up training and improve the compression efficiency, we predict the Gaussian splats of a frame based on its previous frame; (ii) To control the trade-offs between file size and quality, we remove Gaussian splats with low contribution to the video quality; (iii) To capture dynamics in videos, we randomly add Gaussian splats to fit content with large motion or newly-appeared objects; (iv) To handle significant changes in the scene, we detect key…
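
Techniques (i) to (iii) in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Splat` fields, the isotropic Gaussian, the contribution score `|weight| * sigma^2`, and all function names are assumptions made for illustration, and the optimization step that would refine each frame's splats is omitted.

```python
import math
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Splat:
    x: float       # center column, in pixels
    y: float       # center row, in pixels
    sigma: float   # isotropic standard deviation (a simplifying assumption)
    weight: float  # grayscale intensity carried by the splat


def render(splats: List[Splat], width: int, height: int) -> List[List[float]]:
    """Sum every splat's Gaussian footprint into a grayscale image."""
    image = [[0.0] * width for _ in range(height)]
    for s in splats:
        for row in range(height):
            for col in range(width):
                d2 = (col - s.x) ** 2 + (row - s.y) ** 2
                image[row][col] += s.weight * math.exp(-d2 / (2.0 * s.sigma ** 2))
    return image


def init_from_previous(prev: List[Splat]) -> List[Splat]:
    """(i) Start the current frame from copies of the previous frame's splats."""
    return [replace(s) for s in prev]


def prune(splats: List[Splat], keep_ratio: float) -> List[Splat]:
    """(ii) Keep the splats with the largest hypothetical contribution score."""
    keep = max(1, int(len(splats) * keep_ratio))
    ranked = sorted(splats, key=lambda s: abs(s.weight) * s.sigma ** 2, reverse=True)
    return ranked[:keep]


def add_random(splats: List[Splat], count: int, width: int, height: int,
               rng: random.Random) -> List[Splat]:
    """(iii) Append splats at random positions; weight 0 until they are fitted."""
    new = [Splat(rng.uniform(0, width - 1), rng.uniform(0, height - 1), 1.0, 0.0)
           for _ in range(count)]
    return splats + new


# Toy usage on an 8x8 frame: predict, prune to half, then add one new splat.
frame0 = [Splat(4, 4, 1.5, 1.0), Splat(1, 1, 0.5, 0.05),
          Splat(6, 2, 1.0, 0.6), Splat(2, 6, 0.4, 0.02)]
frame1 = add_random(prune(init_from_previous(frame0), 0.5), 1, 8, 8, random.Random(0))
image = render(frame1, 8, 8)
```

In a real pipeline each step would be followed by gradient-based fitting of the splat parameters to the target frame, and the pruning criterion would be measured against video quality rather than this fixed proxy.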

Taxonomy

Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Image and Signal Denoising Methods

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sparse Evolutionary Training