Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE
Yucheng Xu, Li Nanbo, Arushi Goel, Zijian Guo, Zonghai Yao, Hamidreza Kasaei, Mohammadreze Kasaei, Zhibin Li

TL;DR
This paper introduces TiV-ODE, a novel framework using Neural ODEs to generate controllable, dynamic videos from static images and text, effectively modeling complex dynamical systems.
Contribution
The paper proposes a new Neural ODE-based framework for controllable video generation from images and captions, enabling modeling of complex dynamical systems.
Findings
Generated videos are highly controllable and visually consistent.
The method effectively models complex dynamical systems.
The method generates videos with the desired dynamics and content.
Abstract
Videos depict the change of complex dynamical systems over time in the form of discrete image sequences. Generating controllable videos by learning the dynamical system is an important yet underexplored topic in the computer vision community. This paper presents a novel framework, TiV-ODE, to generate highly controllable videos from a static image and a text caption. Specifically, our framework leverages the ability of Neural Ordinary Differential Equations (Neural ODEs) to represent complex dynamical systems as a set of nonlinear ordinary differential equations. The resulting framework is capable of generating videos with both desired dynamics and content. Experiments demonstrate the ability of the proposed method to generate highly controllable and visually consistent videos, and its capability of modeling dynamical systems. Overall, this work is a significant step towards…
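As a rough illustration of what "representing a dynamical system as a set of nonlinear ODEs" can look like in practice, the sketch below integrates a small latent state z under dz/dt = f(z, c), where c stands in for a control signal such as a caption embedding. This is not the TiV-ODE architecture: the function names (`dynamics`, `rk4_step`, `rollout`), the fixed weight matrices `W` and `U`, the tanh form of f, and the choice of a fixed-step RK4 solver are all assumptions made for illustration. In a real model f would be a trained network, and each latent state would be passed to a decoder to produce a frame.

```python
import math

def dynamics(z, c, W, U):
    """Hypothetical stand-in for a learned f_theta: dz/dt = tanh(W z + U c)."""
    return [
        math.tanh(sum(w * zj for w, zj in zip(w_row, z))
                  + sum(u * cj for u, cj in zip(u_row, c)))
        for w_row, u_row in zip(W, U)
    ]

def rk4_step(z, c, dt, W, U):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]
    k1 = dynamics(z, c, W, U)
    k2 = dynamics(shifted(z, k1, dt / 2), c, W, U)
    k3 = dynamics(shifted(z, k2, dt / 2), c, W, U)
    k4 = dynamics(shifted(z, k3, dt), c, W, U)
    return [
        zi + dt / 6 * (a + 2 * b + 2 * d + e)
        for zi, a, b, d, e in zip(z, k1, k2, k3, k4)
    ]

def rollout(z0, c, timestamps, W, U, substeps=10):
    """Integrate from t=0 and return the latent state at each timestamp.

    `timestamps` must be increasing; each returned state would be decoded
    into one video frame in a full model (decoder omitted here).
    """
    states = []
    z, t = list(z0), 0.0
    for t_target in timestamps:
        dt = (t_target - t) / substeps
        for _ in range(substeps):
            z = rk4_step(z, c, dt, W, U)
        t = t_target
        states.append(z)
    return states

# Same initial state, two different control vectors: the trajectories differ,
# which is the sense in which the control signal steers the dynamics.
W = [[0.0, -1.0], [1.0, 0.0]]
U = [[0.5], [0.0]]
z0 = [1.0, 0.0]
traj_a = rollout(z0, [0.0], [0.5, 1.0], W, U)
traj_b = rollout(z0, [1.0], [0.5, 1.0], W, U)
```

Because the state is defined by an ODE rather than a fixed-step recurrence, the same rollout can be queried at unevenly spaced timestamps simply by changing the `timestamps` list.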
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Model Reduction and Neural Networks
