Click to Move: Controlling Video Generation with Sparse Motion

Pierfrancesco Ardino; Marco De Nadai; Bruno Lepri; Elisa Ricci; Stéphane Lathuilière

arXiv:2108.08815·cs.CV·August 20, 2021

Open Access · 1 Repo

TL;DR

This paper presents Click to Move (C2M), a framework for user-controlled video generation from sparse motion inputs. It uses a Graph Convolution Network (GCN) to model object interactions and outperforms existing methods on two publicly available datasets.

Contribution

A novel GCN-based architecture for user-controlled video synthesis that combines sparse motion inputs with holistic modeling of object interactions.

Findings

01. C2M outperforms existing methods on two publicly available datasets.

02. The GCN models the movements of all objects in the scene holistically, capturing their interactions.

03. User control via sparse clicks produces plausible video sequences consistent with the input motion.

Abstract

This paper introduces Click to Move (C2M), a novel framework for video generation where the user can control the motion of the synthesized video through mouse clicks specifying simple object trajectories of the key objects in the scene. Our model receives as input an initial frame, its corresponding segmentation map and the sparse motion vectors encoding the input provided by the user. It outputs a plausible video sequence starting from the given frame and with a motion that is consistent with user input. Notably, our proposed deep architecture incorporates a Graph Convolution Network (GCN) modelling the movements of all the objects in the scene in a holistic manner and effectively combining the sparse user motion information and image features. Experimental results show that C2M outperforms existing methods on two publicly available datasets, thus demonstrating the effectiveness of our…
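To make the idea of combining sparse user motion with a graph over scene objects more concrete, here is a minimal sketch in plain Python. It is not the C2M architecture: the helper names (`sparse_motion_features`, `normalized_adjacency`, `gcn_layer`), the three-value feature layout, and the choice of a standard normalized-adjacency graph convolution are all assumptions made for illustration. Each object gets a feature row built from a click trajectory (or zeros if the user gave none), and one graph convolution step spreads that information to connected objects.

```python
# Illustrative sketch only: NOT the authors' implementation.
# All function names and the feature layout are hypothetical.
import math


def sparse_motion_features(num_objects, clicks):
    """Build one [dx, dy, has_input] row per object.

    clicks maps an object index to ((x0, y0), (x1, y1)), i.e. the start
    and end of a user-drawn trajectory. Objects without a click get zeros.
    """
    feats = [[0.0, 0.0, 0.0] for _ in range(num_objects)]
    for obj, ((x0, y0), (x1, y1)) in clicks.items():
        feats[obj] = [float(x1 - x0), float(y1 - y0), 1.0]
    return feats


def normalized_adjacency(num_objects, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected object graph."""
    n = num_objects
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(adj_norm, feats, weight):
    """One graph convolution step: ReLU(A_norm @ X @ W).

    The output is a hidden feature per object, not a displacement;
    the ReLU discards negative values.
    """
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]
```

As a usage example under the same assumptions: with three objects, a single click dragging object 0 by 20 pixels to the right, one edge between objects 0 and 1, and an identity weight matrix, `gcn_layer` gives object 1 a non-zero feature even though the user never clicked on it, while the unconnected object 2 stays at zero. In a trained model the weight matrix would be learned and the object features would also include image information.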

Code & Models

Repositories

pierfrancescoardino/c2m (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Human Pose and Action Recognition

Methods: Convolution · Graph Convolutional Network