RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content

Tianhao Peng; Chen Feng; Duolikun Danier; Fan Zhang; Benoit Vallade; Alex Mackin; David Bull

arXiv:2405.08621 · eess.IV · June 10, 2025

TL;DR

This paper introduces RMT-BVQA, a deep learning-based blind video quality assessment method tailored for enhanced videos. It combines a recurrent memory transformer with contrastive learning and outperforms ten existing no-reference quality metrics.

Contribution

It presents a new RMT-based network architecture and a contrastive learning strategy specifically designed for assessing the quality of enhanced video content.

Findings

01

Outperforms ten existing no-reference quality metrics.

02

Shows superior correlation performance on the VDPVE dataset.

03

Is optimized on a new database containing 13K training patches with enhanced content.
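"Correlation performance" for a quality metric is usually reported as the agreement between predicted scores and subjective opinion scores. This page does not say which coefficients were used, so the sketch below assumes the two most common ones in VQA work: Pearson linear correlation (PLCC) and Spearman rank-order correlation (SROCC). The function names are illustrative, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

SROCC only checks that the metric orders videos the same way viewers do, while PLCC also rewards a linear relationship between the two score scales, which is why the two are often reported together.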

Abstract

With recent advances in deep learning, numerous algorithms have been developed to enhance video quality, reduce visual artifacts, and improve perceptual quality. However, little research has been reported on the quality assessment of enhanced content; the evaluation of enhancement methods is often based on quality metrics that were designed for compression applications. In this paper, we propose a novel blind deep video quality assessment (VQA) method specifically for enhanced video content. It employs a new Recurrent Memory Transformer (RMT) based network architecture to obtain video quality representations, which is optimized through a novel content-quality-aware contrastive learning strategy based on a new database containing 13K training patches with enhanced content. The extracted quality representations are then combined through linear regression to generate video-level quality…
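The final step described in the abstract, combining extracted representations through linear regression, can be illustrated with a small sketch. The abstract is truncated here and does not say how patch-level representations are pooled or how the regressor is fitted, so everything below is an assumption: mean pooling over patches, an ordinary least-squares fit with a tiny ridge term for numerical stability, and hypothetical function names (`mean_pool`, `fit_linear`, `predict`). It is not the authors' implementation.

```python
def mean_pool(patch_features):
    """Average per-patch feature vectors into one video-level vector (assumed pooling)."""
    n = len(patch_features)
    dim = len(patch_features[0])
    return [sum(p[d] for p in patch_features) / n for d in range(dim)]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(features, scores, ridge=1e-8):
    """Least-squares fit of score = w . feature + bias; the bias is the last weight."""
    rows = [list(f) + [1.0] for f in features]
    d = len(rows[0])
    # Normal equations (A^T A + ridge * I) w = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    atb = [sum(r[i] * s for r, s in zip(rows, scores)) for i in range(d)]
    return solve(ata, atb)


def predict(weights, feature):
    """Predicted video-level score for one pooled feature vector."""
    return sum(w * f for w, f in zip(weights, feature)) + weights[-1]
```

In use, each training video would be reduced to one pooled vector, `fit_linear` would be called on those vectors and their subjective scores, and `predict` would then score unseen videos.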

Taxonomy

Topics: Image and Video Quality Assessment · Advanced Optical Imaging Technologies · Advanced Image Processing Techniques

Methods: Linear Layer · Multi-Head Attention · Dense Connections · Position-Wise Feed-Forward Layer · Dropout · Label Smoothing · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam