SMC++: Masked Learning of Unsupervised Video Semantic Compression

Yuan Tian; Xiaoyue Ling; Cong Geng; Qiang Hu; Guo Lu; Guangtao Zhai

arXiv:2406.04765 · cs.CV · October 14, 2025 · 2 cites

TL;DR

This paper introduces SMC++, a novel video compression framework that emphasizes preserving semantic information for downstream analysis, utilizing masked video modeling, entropy regularization, and Transformer-based modules.

Contribution

The paper proposes a self-supervised, semantics-preserving video compression method that combines a new masked motion prediction objective with Transformer-based compression modules.

Findings

01. Outperforms traditional and learnable codecs on multiple datasets

02. Enhances downstream video analysis tasks

03. Effectively preserves semantic content during compression

Abstract

Most video compression methods focus on human visual perception and neglect semantic preservation. This leads to severe semantic loss during compression, hampering downstream video analysis tasks. In this paper, we propose a Masked Video Modeling (MVM)-powered compression framework that specifically preserves video semantics by jointly mining and compressing them in a self-supervised manner. While MVM is proficient at learning generalizable semantics through the masked patch prediction task, it may also encode non-semantic information such as trivial textural details, wasting bit cost and introducing semantic noise. To suppress this, we explicitly regularize the non-semantic entropy of the compressed video in the MVM token space. The proposed framework is instantiated as a simple Semantic-Mining-then-Compression (SMC) model. Furthermore, we extend SMC as an advanced SMC++ model…
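The two ingredients named in the abstract, masked patch prediction and an entropy penalty on tokens, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the plain mean-squared-error loss, the empirical Shannon entropy used as a stand-in for bit cost, and the weight `lam` are all assumptions made for illustration. The paper's regularizer targets non-semantic entropy in the MVM token space, which this generic penalty does not model.

```python
import math
import random
from collections import Counter


def random_patch_mask(num_patches, mask_ratio, seed=0):
    """Return a list of booleans; True marks a patch hidden from the model."""
    rng = random.Random(seed)
    num_masked = round(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def masked_mse(pred, target, mask):
    """Mean squared error over masked patches only.

    Each patch is a list of floats; unmasked patches do not contribute.
    """
    errors = [
        (p - t) ** 2
        for pred_patch, target_patch, m in zip(pred, target, mask) if m
        for p, t in zip(pred_patch, target_patch)
    ]
    return sum(errors) / len(errors)


def empirical_entropy_bits(tokens):
    """Shannon entropy, in bits per token, of a discrete token sequence."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def toy_objective(pred, target, mask, tokens, lam=0.1):
    """Hypothetical combined loss: masked prediction error plus entropy penalty."""
    return masked_mse(pred, target, mask) + lam * empirical_entropy_bits(tokens)


# Toy usage: 8 patches of 2 values each, 75% of them masked.
mask = random_patch_mask(num_patches=8, mask_ratio=0.75)
target = [[1.0, 1.0] for _ in range(8)]
pred = [[0.0, 1.0] for _ in range(8)]
tokens = [0, 0, 1, 1]
loss = toy_objective(pred, target, mask, tokens)
```

In this sketch a larger `lam` trades prediction accuracy for a lower-entropy (cheaper to code) token stream, which is the general shape of a rate-regularized objective.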

Code & Models

Repositories

tianyuan168326/videosemanticcompression-pytorch (PyTorch, official)

Taxonomy

Topics: Advanced Data Compression Techniques · Video Analysis and Summarization

Methods: Focus · ALIGN