End-to-end Transformer for Compressed Video Quality Enhancement

Li Yu; Wenshuai Chang; Shiyu Wu; Moncef Gabbouj

arXiv:2210.13827 · cs.MM · October 26, 2022 · 1 citation

TL;DR

This paper introduces a novel transformer-based approach for compressed video quality enhancement that effectively models spatiotemporal features while improving efficiency and outperforming existing methods in quality and speed.

Contribution

The paper proposes a transformer-based framework with Swin-AutoEncoder and channel-wise attention modules, enhancing correlation modeling and efficiency in compressed video quality enhancement.
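
The summary above does not spell out how the channel-wise attention is computed. As a rough illustration only, the sketch below shows a generic squeeze-and-excitation style channel gate in NumPy. The function name `channel_attention`, the weight matrices `w_reduce` and `w_expand`, and the reduction ratio `r` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def channel_attention(features, w_reduce, w_expand):
    """Reweight the channels of a (C, H, W) feature map (hypothetical sketch)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))                # shape (C,)
    # Excite: a small bottleneck maps descriptors to per-channel gates.
    hidden = np.maximum(w_reduce @ squeezed, 0.0)        # ReLU, shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))   # sigmoid, shape (C,)
    # Scale: every channel is multiplied by its gate, a value in (0, 1).
    return features * gates[:, None, None], gates

# Toy input with assumed sizes: 8 channels, 16x16 resolution, reduction ratio 4.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
features = rng.standard_normal((C, H, W))
w_reduce = rng.standard_normal((C // r, C)) * 0.1
w_expand = rng.standard_normal((C, C // r)) * 0.1
out, gates = channel_attention(features, w_reduce, w_expand)
```

The output keeps the input shape; only the relative weight of each channel changes, which is the general idea behind attending over channels rather than spatial positions.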

Findings

01 Outperforms existing methods in quality metrics.

02 Achieves higher inference speed and lower GPU consumption.

03 Demonstrates superior subjective and objective quality on test sequences.

Abstract

Convolutional neural networks have achieved excellent results in the compressed video quality enhancement task in recent years. State-of-the-art methods explore the spatiotemporal information of adjacent frames mainly through deformable convolution. However, the offset fields in deformable convolution are difficult to train, and this training instability often leads to offset overflow, which reduces the efficiency of correlation modeling. In this work, we propose a transformer-based compressed video quality enhancement (TVQE) method, consisting of a Swin-AutoEncoder based Spatio-Temporal feature Fusion (SSTF) module and a Channel-wise Attention based Quality Enhancement (CAQE) module. The proposed SSTF module learns both local and global features with the help of the Swin-AutoEncoder, which improves the ability of correlation modeling. Meanwhile, the window mechanism-based Swin Transformer and the…
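
The abstract refers to the window mechanism of the Swin Transformer. As a minimal sketch of that general idea, and not of the SSTF module itself, the NumPy code below splits a feature map into non-overlapping windows so that attention could be computed inside each window rather than over the whole frame. The helper names `window_partition` and `window_reverse` and the toy sizes are assumptions chosen for this example.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows."""
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    # Group the two window-grid axes first, then flatten each window's tokens.
    # Result shape: (num_windows, window_size * window_size, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size * window_size, C)

def window_reverse(windows, window_size, H, W):
    """Inverse of window_partition: reassemble the (H, W, C) feature map."""
    C = windows.shape[-1]
    x = windows.reshape(H // window_size, W // window_size, window_size, window_size, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

# Toy 8x8 feature map with 2 channels, split into four 4x4 windows.
x = np.arange(8 * 8 * 2, dtype=np.float32).reshape(8, 8, 2)
windows = window_partition(x, 4)
restored = window_reverse(windows, 4, 8, 8)
```

With self-attention restricted to each window, the number of pairwise token comparisons grows with the number of windows times the square of the window area, instead of with the square of the full frame area.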

Taxonomy

Topics: Image and Video Quality Assessment · Image Enhancement Techniques · Advanced Image Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Test · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding