Raformer: Redundancy-Aware Transformer for Video Wire Inpainting

Zhong Ji, Yimu Su, Yan Zhang, Jiacheng Hou, Yanwei Pang, and Jungong Han

arXiv:2404.15802·cs.CV·April 25, 2024·2 cites

Open Access · 1 Repo

TL;DR

Raformer introduces a redundancy-aware transformer architecture tailored for video wire inpainting, utilizing a new dataset and modules that selectively focus on essential content, significantly improving wire removal performance.

Contribution

The paper presents a novel redundancy-aware transformer model and a new large-scale dataset for effective wire removal in video inpainting, addressing limitations of existing datasets and methods.

Findings

01. Raformer outperforms state-of-the-art methods on multiple datasets.

02. The new WRV2 dataset facilitates better training and evaluation.

03. Redundancy-aware attention improves focus on critical regions.

Abstract

Video Wire Inpainting (VWI) is a prominent application in video inpainting, aimed at flawlessly removing wires in films or TV series, offering significant time and labor savings compared to manual frame-by-frame removal. However, wire removal poses greater challenges due to the wires being longer and slimmer than objects typically targeted in general video inpainting tasks, and often intersecting with people and background objects irregularly, which adds complexity to the inpainting process. Recognizing the limitations posed by existing video wire datasets, which are characterized by their small size, poor quality, and limited variety of scenes, we introduce a new VWI dataset with a novel mask generation strategy, namely Wire Removal Video Dataset 2 (WRV2) and Pseudo Wire-Shaped (PWS) Masks. WRV2 dataset comprises over 4,000 videos with an average length of 80 frames, designed to…

Code & Models

Repositories

Suyimu/WRV2 (PyTorch, official)

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Industrial Vision Systems and Defect Detection

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Dropout · Dense Connections · Label Smoothing · Residual Connection · Softmax · Inpainting