Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach

Qi Zhang; Shanshe Wang; Xinfeng Zhang; Siwei Ma; Jingshan Pan; Wen Gao

arXiv:2412.17477·cs.CV·December 24, 2024

TL;DR

This paper introduces a deep learning model that simultaneously predicts the perceptual quality of compressed images for humans and for machines, to help optimize image compression for both viewers and analysis systems.

Contribution

A unified deep learning framework that predicts SUR and SMR for compressed images, incorporating novel modules such as DFRL and MHAAP for better feature discrimination and aggregation.

Findings

01. The model significantly outperforms existing methods in SUR and SMR prediction.

02. Joint learning improves prediction accuracy for both human and machine perceptual quality.

03. Pre-training on large-scale datasets makes the predictions more robust.

Abstract

Nowadays, high-quality images are pursued by both humans for better viewing experience and by machines for more accurate visual analysis. However, images are usually compressed before being consumed, decreasing their quality. It is meaningful to predict the perceptual quality of compressed images for both humans and machines, which guides the optimization for compression. In this paper, we propose a unified approach to address this. Specifically, we create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously. We first pre-train a feature extractor network on a large-scale SMR-annotated dataset with human perception-related quality labels generated by diverse image quality models, which simulates the acquisition of SUR labels. Then, we propose an MLP-Mixer-based network to predict SUR and SMR by leveraging…
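The abstract names SUR and SMR but this page does not say how either ratio is computed. As a minimal sketch, assuming each ratio is simply the fraction of observers (human subjects for SUR, analysis models for SMR) who still find the compressed image acceptable at a given distortion level, the labels the model is trained to predict could be computed as below. The function name `satisfied_ratio`, the per-observer thresholds, and the distortion scale are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def satisfied_ratio(thresholds: Sequence[float], level: float) -> float:
    """Fraction of observers still satisfied at a given distortion level.

    Assumption: thresholds[i] is the distortion level at which observer i
    first becomes dissatisfied, so observer i counts as satisfied while
    level < thresholds[i].
    """
    if not thresholds:
        raise ValueError("need at least one observer")
    satisfied = sum(1 for t in thresholds if level < t)
    return satisfied / len(thresholds)


# Made-up thresholds on an arbitrary distortion scale (higher = more compressed).
human_thresholds = [3.0, 4.0, 4.0, 6.0]          # one per human subject
machine_thresholds = [2.0, 5.0, 5.0, 5.0, 7.0]   # one per analysis model

sur = satisfied_ratio(human_thresholds, level=4.0)
smr = satisfied_ratio(machine_thresholds, level=4.0)
print(f"SUR={sur:.2f}, SMR={smr:.2f}")
```

Under this reading both ratios share one definition and differ only in who the observers are, which is one way to see why a single network could be trained to predict both.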

Taxonomy

Topics: Image and Video Quality Assessment

Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention