Flow-Guided Transformer for Video Inpainting
Kaidong Zhang, Jingjing Fu, Dong Liu

TL;DR
This paper introduces a flow-guided transformer that uses optical flow information to improve high-fidelity video inpainting by propagating content across frames and selectively guiding attention.
Contribution
It presents a novel flow completion network and a flow-guided transformer with flow reweighting and window partition strategies for efficient, high-quality video inpainting.
Findings
Effective in restoring corrupted video regions
Outperforms existing methods quantitatively and qualitatively
Efficient due to window partition and dual perspective attention
Abstract
We propose a flow-guided transformer, which innovatively leverages the motion discrepancy exposed by optical flows to instruct the attention retrieval in the transformer for high-fidelity video inpainting. More specifically, we design a novel flow completion network to complete the corrupted flows by exploiting the relevant flow features in a local temporal window. With the completed flows, we propagate the content across video frames, and adopt the flow-guided transformer to synthesize the remaining corrupted regions. We decouple transformers along the temporal and spatial dimensions, so that we can easily integrate the locally relevant completed flows to instruct spatial attention only. Furthermore, we design a flow-reweight module to precisely control the impact of completed flows on each spatial transformer. For the sake of efficiency, we introduce a window partition strategy to both spatial and…
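The propagation step mentioned in the abstract ("with the completed flows, we propagate the content across video frames") can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `propagate_with_flow`, the nearest-neighbour sampling, the single source frame, and the `(dx, dy)` flow layout are all assumptions made here for illustration.

```python
import numpy as np

def propagate_with_flow(target, target_mask, source, source_mask, flow):
    """Fill corrupted target pixels from a source frame along a completed flow.

    Hypothetical helper (not from the paper).
    target, source: (H, W, C) float arrays.
    target_mask, source_mask: (H, W) bool arrays, True where corrupted.
    flow: (H, W, 2) array; flow[y, x] = (dx, dy) points from a target pixel
          to its corresponding location in the source frame (assumed layout).
    Returns the filled frame and the mask of pixels that are still corrupted.
    """
    h, w = target_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Nearest-neighbour sampling location in the source frame.
    sx = np.rint(xs + flow[..., 0]).astype(int)
    sy = np.rint(ys + flow[..., 1]).astype(int)

    # Discard correspondences that fall outside the frame.
    inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    sx_c = np.clip(sx, 0, w - 1)
    sy_c = np.clip(sy, 0, h - 1)

    # A pixel is fillable if it is corrupted in the target, its flow lands
    # inside the frame, and the source pixel it lands on is itself valid.
    fillable = target_mask & inside & ~source_mask[sy_c, sx_c]

    out = target.copy()
    out[fillable] = source[sy_c[fillable], sx_c[fillable]]

    # Whatever could not be propagated is left for a synthesis stage.
    remaining = target_mask & ~fillable
    return out, remaining
```

In this sketch, the returned `remaining` mask marks the pixels that flow-based propagation could not reach; per the abstract, those are the regions the flow-guided transformer is meant to synthesize.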
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques
