# Quality Assessment of In-the-Wild Videos

**Authors:** Dingquan Li, Tingting Jiang, Ming Jiang

arXiv: 1908.00375 · 2019-10-08

## TL;DR

This paper introduces a no-reference video quality assessment method that builds two human visual system effects, content-dependency and temporal-memory, into a deep neural network, and validates it on three in-the-wild video databases: KoNViD-1k, CVD2014, and LIVE-Qualcomm.

## Contribution

It proposes a novel deep learning approach that integrates content-dependency and temporal-memory effects for improved in-the-wild video quality assessment.

## Key findings

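- Outperforms five state-of-the-art methods by a large margin on KoNViD-1k, CVD2014, and LIVE-Qualcomm.
- Improves overall performance over the second-best method, VBLIINDS, by 12.39% (SROCC), 15.71% (KROCC), 15.45% (PLCC), and 18.09% (RMSE).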
- Validates the importance of content features and temporal-memory modeling.

## Abstract

Quality assessment of in-the-wild videos is a challenging problem because of the absence of reference videos and shooting distortions. Knowledge of the human visual system can help establish methods for objective quality assessment of in-the-wild videos. In this work, we show two eminent effects of the human visual system, namely, content-dependency and temporal-memory effects, could be used for this purpose. We propose an objective no-reference video quality assessment method by integrating both effects into a deep neural network. For content-dependency, we extract features from a pre-trained image classification neural network for its inherent content-aware property. For temporal-memory effects, long-term dependencies, especially the temporal hysteresis, are integrated into the network with a gated recurrent unit and a subjectively-inspired temporal pooling layer. To validate the performance of our method, experiments are conducted on three publicly available in-the-wild video quality assessment databases: KoNViD-1k, CVD2014, and LIVE-Qualcomm, respectively. Experimental results demonstrate that our proposed method outperforms five state-of-the-art methods by a large margin, specifically, 12.39%, 15.71%, 15.45%, and 18.09% overall performance improvements over the second-best method VBLIINDS, in terms of SROCC, KROCC, PLCC and RMSE, respectively. Moreover, the ablation study verifies the crucial role of both the content-aware features and the modeling of temporal-memory effects. The PyTorch implementation of our method is released at https://github.com/lidq92/VSFA.
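
To make the idea of a subjectively-inspired temporal pooling layer more concrete, the following is a minimal sketch of one way temporal hysteresis can be expressed over per-frame quality scores. It rests on assumptions that this summary does not state: a memory term taken as the worst score among recent frames, a current term that weights upcoming frames so low scores count more, and illustrative values for the window `tau` and mixing weight `gamma`. The function name `hysteresis_pool` is hypothetical. The released VSFA repository remains the authoritative reference for the actual layer.

```python
import math


def hysteresis_pool(frame_scores, tau=12, gamma=0.5):
    """Pool per-frame quality scores into a single video-level score.

    Assumed form (illustrative, not taken from this summary):
      memory_t  = min of up to `tau` previous frame scores
      current_t = softmin-weighted mean of frame t and up to `tau` next frames
      pooled_t  = gamma * memory_t + (1 - gamma) * current_t
    The video score is the mean of pooled_t over all frames.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    n = len(frame_scores)
    pooled = []
    for t in range(n):
        # Memory term: viewers remember poor quality, so take the worst
        # recent score (fall back to the frame itself at the first frame).
        past = frame_scores[max(0, t - tau):t]
        memory = min(past) if past else frame_scores[t]

        # Current term: softmin weights give low-quality frames more weight.
        window = frame_scores[t:min(n, t + tau + 1)]
        shift = min(window)  # subtract the minimum for numerical stability
        weights = [math.exp(-(q - shift)) for q in window]
        current = sum(w * q for w, q in zip(weights, window)) / sum(weights)

        pooled.append(gamma * memory + (1 - gamma) * current)
    return sum(pooled) / n
```

With this form, a single bad frame lowers the pooled score for the frames that follow it, so a brief quality drop costs more than it would under plain averaging, while a video with constant frame quality keeps that quality as its score.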

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.00375/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1908.00375/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/1908.00375/full.md

---
Source: https://tomesphere.com/paper/1908.00375