Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling

Qi Zhang; Shanshe Wang; Xinfeng Zhang; Chuanmin Jia; Zhao Wang; Siwei Ma; Wen Gao

arXiv:2211.06797·cs.CV·January 10, 2024

TL;DR

This paper introduces the Satisfied Machine Ratio (SMR), a metric for evaluating the perceptual quality of compressed images and videos for machine analysis. SMR-based coding improves compression efficiency and generalizes across machines and codecs.

Contribution

It proposes the SMR metric, builds a large-scale SMR dataset, and develops a prediction model that significantly improves machine-oriented video compression performance.

Findings

01 · SMR models improve compression efficiency for machines.

02 · The models generalize well to unseen machines and codecs.

03 · Extensive experiments validate the effectiveness of SMR-based coding.

Abstract

Video Coding for Machines (VCM) aims to compress visual signals for machine analysis. However, existing methods only consider a few machines, neglecting the majority. Moreover, the machine's perceptual characteristics are not leveraged effectively, resulting in suboptimal compression efficiency. To overcome these limitations, this paper introduces Satisfied Machine Ratio (SMR), a metric that statistically evaluates the perceptual quality of compressed images and videos for machines by aggregating satisfaction scores from them. Each score is derived from machine perceptual differences between original and compressed images. Targeting image classification and object detection tasks, we build two representative machine libraries for SMR annotation and create a large-scale SMR dataset to facilitate SMR studies. We then propose an SMR prediction model based on the correlation between deep…
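To make the aggregation idea in the abstract concrete, here is a minimal sketch for the image classification case. It rests on an assumption that the abstract does not spell out: a machine counts as "satisfied" when its top-1 label on the compressed image matches its top-1 label on the original. The function name `satisfied_machine_ratio` and the sample labels are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def satisfied_machine_ratio(orig_labels: Sequence[int],
                            comp_labels: Sequence[int]) -> float:
    """Return the fraction of machines whose prediction survives compression.

    ASSUMPTION (not from the source): a machine is satisfied when its
    top-1 label on the compressed image equals its label on the original.
    Entry i of each sequence is the prediction of machine i.
    """
    if len(orig_labels) != len(comp_labels):
        raise ValueError("need one prediction pair per machine")
    if len(orig_labels) == 0:
        raise ValueError("machine library is empty")
    # Binary satisfaction score per machine: 1 if unchanged, else 0.
    scores = [int(o == c) for o, c in zip(orig_labels, comp_labels)]
    # Aggregate the per-machine scores into a single ratio in [0, 1].
    return sum(scores) / len(scores)


# Hypothetical library of five classifiers looking at one image pair.
orig = [3, 3, 7, 3, 3]   # labels predicted on the original image
comp = [3, 3, 7, 5, 3]   # labels predicted on the compressed image
smr = satisfied_machine_ratio(orig, comp)
print(f"SMR = {smr:.2f}")
```

In this toy case four of the five machines keep their prediction, so the ratio is 0.8. An encoder could, under the same assumption, raise the compression level until the ratio drops below a target value.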

Code & Models

Repositories

ywwynm/smr
PyTorch · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Generative Adversarial Networks and Image Synthesis