VORNet: Spatio-temporally Consistent Video Inpainting for Object Removal
Ya-Liang Chang, Zhe Yu Liu, Winston Hsu

TL;DR
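VORNet is a learning-based framework for video object removal that combines optical flow warping with image-based inpainting to keep the filled region consistent both within each frame and across frames. In objective and subjective evaluations on the SVOR dataset, it improves over existing methods.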
Contribution
The paper introduces VORNet, a learning-based approach to video object removal that targets both spatial and temporal consistency. It addresses a limitation of image-based inpainting methods, which often produce inconsistent results between frames when applied to videos.
Findings
VORNet outperforms existing methods in spatial and temporal consistency.
Experiments on the SVOR dataset show improved visual quality.
Both objective and subjective evaluations favor VORNet.
Abstract
Video object removal is a challenging task in video processing that often requires massive human efforts. Given the mask of the foreground object in each frame, the goal is to complete (inpaint) the object region and generate a video without the target object. While recently deep learning based methods have achieved great success on the image inpainting task, they often lead to inconsistent results between frames when applied to videos. In this work, we propose a novel learning-based Video Object Removal Network (VORNet) to solve the video object removal task in a spatio-temporally consistent manner, by combining the optical flow warping and image-based inpainting model. Experiments are done on our Synthesized Video Object Removal (SVOR) dataset based on the YouTube-VOS video segmentation dataset, and both the objective and subjective evaluation demonstrate that our VORNet generates…
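The idea of combining optical flow warping with inpainting can be illustrated with a minimal NumPy sketch. This is not the VORNet architecture, which is a learned network; it only shows how a previous frame can be warped by a given flow field to supply candidate pixels for the masked region. The function names `warp_with_flow` and `composite`, the backward-flow convention, and the nearest-neighbour sampling are assumptions made for illustration.

```python
import numpy as np

def warp_with_flow(prev_frame, flow):
    """Backward-warp prev_frame with a dense flow field (assumed convention).

    flow[y, x] = (dx, dy) means the output pixel (y, x) is sampled from
    prev_frame at (y + dy, x + dx), using nearest-neighbour rounding and
    clamping at the image border.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_frame[src_y, src_x]

def composite(current, candidate, mask):
    """Keep known pixels from `current`; fill masked pixels from `candidate`."""
    m = mask.astype(bool)
    out = current.copy()
    out[m] = candidate[m]
    return out

# Toy example: a 4x4 grayscale "previous frame" and a uniform flow of one
# pixel to the right, so each output pixel copies its right-hand neighbour.
prev_frame = np.arange(16, dtype=np.float32).reshape(4, 4)
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = 1.0

current = np.zeros((4, 4), dtype=np.float32)   # current frame, object removed
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True                              # the hole left by the object

warped = warp_with_flow(prev_frame, flow)
filled = composite(current, warped, mask)
```

In a real pipeline the flow would come from an optical flow estimator, and pixels with no reliable correspondence would still need to be synthesized by an inpainting model; the sketch covers only the warping and compositing step.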
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Digital Media Forensic Detection
