ProPainter: Improving Propagation and Transformer for Video Inpainting

Shangchen Zhou; Chongyi Li; Kelvin C.K. Chan; Chen Change Loy

arXiv:2309.03897 · cs.CV · September 8, 2023 · 1 citation

TL;DR

ProPainter introduces dual-domain propagation and a mask-guided sparse Transformer for video inpainting. It reports better accuracy and efficiency than previous methods by exploiting global correspondences more effectively and reducing redundant computation.
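To make the "mask-guided" idea concrete, here is a minimal sketch of one plausible reading: attention effort is spent only on spatial windows that overlap the inpainting mask. The function name `select_masked_windows`, the non-overlapping window layout, and the selection rule are illustrative assumptions, not the authors' implementation.

```python
def select_masked_windows(mask, window):
    """Return (row, col) indices of non-overlapping windows that contain
    at least one masked pixel (mask value 1 = missing, 0 = known).

    Hypothetical helper: the window layout and selection rule are
    assumptions made for illustration only.
    """
    height, width = len(mask), len(mask[0])
    selected = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            rows = mask[top:top + window]
            # Keep the window if any pixel inside it is masked.
            if any(any(row[left:left + window]) for row in rows):
                selected.append((top // window, left // window))
    return selected
```

Under this assumption, windows that lie entirely in known regions are skipped, so the cost scales with the size of the hole instead of the size of the frame.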

Contribution

The paper presents a novel framework combining dual-domain propagation with an efficient sparse Transformer for improved video inpainting.

Findings

1. ProPainter outperforms prior methods by 1.46 dB in PSNR.
2. Dual-domain propagation improves global correspondence accuracy.
3. Sparse Transformer reduces computational redundancy.
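For readers unfamiliar with the metric in the first finding, the following sketch computes PSNR between two images stored as nested lists. It uses the standard definition, 10 · log10(MAX² / MSE); the function name `psnr` and the default peak value of 255 are assumptions for illustration, and the paper's exact evaluation protocol is not stated on this page.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_pixel, est_pixel in zip(ref_row, est_row):
            squared_error += (ref_pixel - est_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 1.46 dB at a fixed peak value corresponds to an MSE that is lower by a factor of 10^(0.146), or about 1.4.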

Abstract

Flow-based propagation and spatiotemporal Transformer are two mainstream mechanisms in video inpainting (VI). Despite the effectiveness of these components, they still suffer from some limitations that affect their performance. Previous propagation-based approaches are performed separately either in the image or feature domain. Global image propagation isolated from learning may cause spatial misalignment due to inaccurate optical flow. Moreover, memory or computational constraints limit the temporal range of feature propagation and video Transformer, preventing exploration of correspondence information from distant frames. To address these issues, we propose an improved framework, called ProPainter, which involves enhanced ProPagation and an efficient Transformer. Specifically, we introduce dual-domain propagation that combines the advantages of image and feature warping, exploiting…
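The image-domain propagation that the abstract refers to can be sketched as follows: masked pixels in a target frame are filled by following the optical flow into a neighboring frame and copying the pixel found there, provided that pixel is itself known. This is a simplified illustration, not ProPainter's method: the name `propagate_pixels`, the nearest-neighbor sampling, and the absence of any flow reliability check are all assumptions.

```python
def propagate_pixels(target, target_mask, neighbor, neighbor_mask, flow):
    """Fill masked target pixels from a neighbor frame along optical flow.

    Masks use 1 = missing, 0 = known. flow[y][x] = (dx, dy) maps a target
    pixel to its location in the neighbor frame. Nearest-neighbor sampling
    is an illustrative simplification.
    Returns (filled_frame, updated_mask).
    """
    height, width = len(target), len(target[0])
    filled = [row[:] for row in target]
    new_mask = [row[:] for row in target_mask]
    for y in range(height):
        for x in range(width):
            if target_mask[y][x] == 0:
                continue  # pixel already known, nothing to fill
            dx, dy = flow[y][x]
            src_x, src_y = int(round(x + dx)), int(round(y + dy))
            inside = 0 <= src_x < width and 0 <= src_y < height
            # Copy only from a valid, unmasked source pixel.
            if inside and neighbor_mask[src_y][src_x] == 0:
                filled[y][x] = neighbor[src_y][src_x]
                new_mask[y][x] = 0
    return filled, new_mask
```

The sketch also shows the failure mode the abstract mentions: if `flow` is inaccurate, the copied pixel comes from the wrong location and the fill is spatially misaligned, since nothing learned corrects it.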

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Advanced Vision and Imaging

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Label Smoothing · Dropout · Absolute Position Encodings · Layer Normalization · Adam