End-to-end Transformer for Compressed Video Quality Enhancement
Li Yu, Wenshuai Chang, Shiyu Wu, Moncef Gabbouj

TL;DR
This paper introduces a novel transformer-based approach for compressed video quality enhancement that effectively models spatiotemporal features while improving efficiency and outperforming existing methods in quality and speed.
Contribution
The paper proposes a transformer-based framework with Swin-AutoEncoder and channel-wise attention modules, enhancing correlation modeling and efficiency in compressed video quality enhancement.
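As a rough illustration of what "channel-wise attention" generally means, the sketch below rescales each feature channel by a gate computed from that channel's global average. It is a generic squeeze-and-excitation style gate written in plain Python, not the paper's CAQE module. The function name `channel_attention` and the weight matrices `w_reduce` and `w_expand` are hypothetical placeholders for parameters that a real network would learn.

```python
import math


def channel_attention(features, w_reduce, w_expand):
    """Rescale each channel of `features` by a gate in (0, 1).

    features: list of C channels, each an H x W list of floats.
    w_reduce: R x C weight matrix (hypothetical, normally learned).
    w_expand: C x R weight matrix (hypothetical, normally learned).
    Returns (scaled_features, gates).
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Excite: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [
        max(0.0, sum(w * p for w, p in zip(row, pooled)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, features)
    ]
    return scaled, gates


# Two 2x2 channels; the weights are arbitrary illustrative values.
feats = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
scaled, gates = channel_attention(feats, [[1.0, 0.0]], [[0.0], [1.0]])
```

Because the gate depends only on per-channel statistics, it adds very little computation compared with the spatial attention inside the transformer blocks.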
Findings
Outperforms existing methods in quality metrics.
Achieves higher inference speed and lower GPU consumption.
Demonstrates superior subjective and objective quality on test sequences.
Abstract
Convolutional neural networks have achieved excellent results in the compressed video quality enhancement task in recent years. State-of-the-art methods explore the spatiotemporal information of adjacent frames mainly by deformable convolution. However, offset fields in deformable convolution are difficult to train, and its instability in training often leads to offset overflow, which reduces the efficiency of correlation modeling. In this work, we propose a transformer-based compressed video quality enhancement (TVQE) method, consisting of a Swin-AutoEncoder based Spatio-Temporal feature Fusion (SSTF) module and a Channel-wise Attention based Quality Enhancement (CAQE) module. The proposed SSTF module learns both local and global features with the help of Swin-AutoEncoder, which improves the ability of correlation modeling. Meanwhile, the window mechanism-based Swin Transformer and the…
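The window mechanism mentioned in the abstract can be pictured with the small sketch below. It shows only the generic partitioning step used by Swin-style transformers, where a feature map is split into non-overlapping tiles and attention is computed inside each tile. It is not the paper's SSTF module, and the helper name `window_partition` is a hypothetical choice for this illustration.

```python
def window_partition(frame, window):
    """Split an H x W grid (list of lists) into non-overlapping tiles.

    Each tile is window x window. This sketch assumes H and W are exact
    multiples of `window`; a real model would pad the input instead.
    Tiles are returned in row-major order.
    """
    h, w = len(frame), len(frame[0])
    if h % window or w % window:
        raise ValueError("height and width must be multiples of the window size")
    tiles = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            # Slice `window` rows, then `window` columns from each row.
            tiles.append([row[left:left + window] for row in frame[top:top + window]])
    return tiles


# A 4x4 grid split into four 2x2 windows.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = window_partition(grid, 2)
```

Restricting attention to each tile means the cost grows with the number of windows rather than with the square of the full frame size, which is the usual efficiency argument for window-based attention.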
Taxonomy
Topics: Image and Video Quality Assessment · Image Enhancement Techniques · Advanced Image Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Test · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding
