Spatial-Temporal Residue Network Based In-Loop Filter for Video Coding
Chuanmin Jia, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma

TL;DR
This paper introduces a lightweight spatial-temporal residue network for in-loop filtering in video coding, effectively reducing artifacts and achieving up to 5.1% bit-rate savings.
Contribution
It proposes a novel, simple four-layer neural network that jointly exploits spatial and temporal information for in-loop filtering, with adaptive control for improved performance.
Findings
Achieves up to 5.1% bit-rate reduction.
Uses only four convolution layers for efficiency.
Incorporates rate-distortion optimized control flags.
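The control-flag finding can be illustrated with a small sketch. The abstract states only that a CTU-level flag is applied in the sense of rate-distortion optimization. The Lagrangian cost J = D + lambda * R, the use of sum of squared errors as the distortion, the one-bit flag cost, and every function and parameter name below are assumptions made for illustration, not details taken from the paper.

```python
def sse(a, b):
    """Sum of squared errors between two equally sized 2D blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def ctu_filter_flag(original, reconstructed, filtered, lam,
                    bits_on=1.0, bits_off=1.0):
    """Return True if filtering this CTU gives the lower RD cost.

    Assumed cost model: J = D + lam * R, with D measured as SSE against
    the original CTU and R the (hypothetical) cost of signalling the flag.
    """
    j_off = sse(original, reconstructed) + lam * bits_off
    j_on = sse(original, filtered) + lam * bits_on
    return j_on < j_off


# Toy 2x2 "CTU": the filtered version is closer to the original.
original = [[10, 10], [10, 10]]
reconstructed = [[12, 9], [10, 11]]
helpful = [[11, 10], [10, 10]]
harmful = [[14, 10], [10, 10]]

flag_helpful = ctu_filter_flag(original, reconstructed, helpful, lam=2.0)
flag_harmful = ctu_filter_flag(original, reconstructed, harmful, lam=2.0)
```

With equal signalling cost for both flag values, the decision reduces to comparing distortions, so the filter is switched on only for CTUs where it actually brings the block closer to the original.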
Abstract
Deep learning has achieved tremendous breakthroughs in image and video processing. In this paper, a spatial-temporal residue network (STResNet) based in-loop filter is proposed to suppress visual artifacts such as blocking and ringing in video coding. Specifically, spatial and temporal information is jointly exploited by taking both the current block and the co-located block in the reference frame into consideration during in-loop filtering. The STResNet architecture consists of only four convolution layers, which keeps memory and coding complexity low. Moreover, to fully adapt to the input content and improve the performance of the proposed in-loop filter, a coding tree unit (CTU) level control flag is applied in the sense of rate-distortion optimization. Extensive experimental results show that our scheme provides up to 5.1% bit-rate reduction compared to the…
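The data flow described in the abstract can be sketched in plain Python as follows. Only three things come from the source: the network has four convolution layers, it takes the current block and the co-located reference block as input, and it is a residue network. The channel widths, 3x3 kernels, ReLU activations, random weights, the point at which the residue is added back, and all names are assumptions for illustration and should not be read as the authors' architecture.

```python
import random


def conv2d(x, w, b):
    """Zero-padded 'same' convolution (cross-correlation form).

    x: [C_in][H][W], w: [C_out][C_in][k][k], b: [C_out].
    """
    h, wd, k = len(x[0]), len(x[0][0]), len(w[0][0])
    p = k // 2
    out = []
    for o in range(len(w)):
        plane = [[b[o]] * wd for _ in range(h)]
        for c in range(len(x)):
            for i in range(h):
                for j in range(wd):
                    acc = 0.0
                    for di in range(k):
                        for dj in range(k):
                            y, xx = i + di - p, j + dj - p
                            if 0 <= y < h and 0 <= xx < wd:
                                acc += w[o][c][di][dj] * x[c][y][xx]
                    plane[i][j] += acc
        out.append(plane)
    return out


def relu(x):
    return [[[max(0.0, v) for v in row] for row in plane] for plane in x]


def init_layer(c_out, c_in, k, rng, scale=0.05):
    """Small random weights and zero biases (stand-in for trained values)."""
    w = [[[[rng.uniform(-scale, scale) for _ in range(k)]
           for _ in range(k)] for _ in range(c_in)] for _ in range(c_out)]
    return w, [0.0] * c_out


def stresnet_like_filter(cur, ref, layers):
    """Predict a residue from (current, co-located) blocks and add it to cur."""
    x = [cur, ref]  # assumed: the two blocks are stacked as input channels
    for idx, (w, b) in enumerate(layers):
        x = conv2d(x, w, b)
        if idx < len(layers) - 1:  # assumed: no activation on the last layer
            x = relu(x)
    residue = x[0]
    return [[cur[i][j] + residue[i][j] for j in range(len(cur[0]))]
            for i in range(len(cur))]


rng = random.Random(0)
# Four convolution layers; the widths (2 -> 4 -> 4 -> 4 -> 1) are hypothetical.
layers = [init_layer(4, 2, 3, rng), init_layer(4, 4, 3, rng),
          init_layer(4, 4, 3, rng), init_layer(1, 4, 3, rng)]
cur = [[rng.random() for _ in range(8)] for _ in range(8)]
ref = [[rng.random() for _ in range(8)] for _ in range(8)]
out = stresnet_like_filter(cur, ref, layers)
```

Because the network predicts only a correction that is added to the current block, an all-zero final layer leaves the reconstruction unchanged, which is the usual motivation for a residual formulation.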
