Learnable Gated Temporal Shift Module for Deep Video Inpainting

Ya-Liang Chang; Zhe Yu Liu; Kuan-Ying Lee; Winston Hsu

arXiv:1907.01131·cs.CV·May 14, 2025·52 cites

TL;DR

This paper introduces LGTSM, a learnable gated temporal shift module that lets 2D CNNs use temporal information from neighboring frames for video inpainting without the additional parameters of 3D convolutions, achieving state-of-the-art results.

Contribution

The paper proposes LGTSM, a novel component that enables 2D CNNs to incorporate temporal context for video inpainting without the parameter cost of 3D convolutions.

Findings

01 Achieves state-of-the-art results on the FaceForensics and FVI datasets.

02 Uses only 33% of the parameters and inference time of existing methods.

03 Handles arbitrary video masks effectively, with improved temporal consistency.

Abstract

How to efficiently utilize temporal information to recover videos in a consistent way is the main issue for video inpainting problems. Conventional 2D CNNs have achieved good performance on image inpainting but often lead to temporally inconsistent results where frames will flicker when applied to videos (see https://www.youtube.com/watch?v=87Vh1HDBjD0&list=PLPoVtv-xp_dL5uckIzz1PKwNjg1yI0I94&index=1); 3D CNNs can capture temporal information but are computationally intensive and hard to train. In this paper, we present a novel component termed Learnable Gated Temporal Shift Module (LGTSM) for video inpainting models that could effectively tackle arbitrary video masks without additional parameters from 3D convolutions. LGTSM is designed to let 2D convolutions make use of neighboring frames more efficiently, which is crucial for video inpainting. Specifically, in each layer, LGTSM learns…
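The abstract is cut off before it says what LGTSM learns in each layer, so the following is only a minimal sketch of the general idea it builds on: shifting some feature channels along the time axis so that a 2D convolution applied per frame also sees neighboring frames, then gating the result. The shift here is fixed rather than learned, and the names `temporal_shift`, `fold_div`, and `gated_output` are hypothetical, not taken from the paper or its repositories.

```python
import numpy as np


def temporal_shift(x, fold_div=4):
    """Fixed (non-learnable) temporal shift over a clip of feature maps.

    x has shape (T, C, H, W). One 1/fold_div slice of the channels is
    filled from the next frame, another slice from the previous frame,
    and the remaining channels are left in place. Frames at the clip
    boundary are zero-padded. This is an assumption-laden illustration,
    not the learnable shift described in the paper.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    # Channels [0, fold): frame t receives features from frame t + 1.
    out[:-1, :fold] = x[1:, :fold]
    # Channels [fold, 2 * fold): frame t receives features from frame t - 1.
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]
    # Remaining channels are not shifted.
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out


def gated_output(features, gate_logits):
    """Gated-convolution style output: features scaled by a sigmoid gate."""
    gate = 1.0 / (1.0 + np.exp(-gate_logits))
    return features * gate
```

In a full model, `features` and `gate_logits` would each come from a 2D convolution applied to the shifted tensor; here they are passed in directly to keep the sketch independent of any deep learning framework.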

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Advanced Vision and Imaging

Methods: Gated Linear Unit · 1x1 Convolution · Gated Convolution · Convolution