Mobius: Text to Seamless Looping Video Generation via Latent Shift
Xiuli Bi, Jianfei Yuan, Bo Liu, Yong Zhang, Xiaodong Cun, Chi-Man Pun, and Bin Xiao

TL;DR
Mobius generates seamlessly looping videos directly from text prompts using a pre-trained video latent diffusion model, with no additional training and no user annotations.
Contribution
The paper introduces Mobius, a novel latent-shifting technique for text-to-looping video generation that does not require training or image-based appearance constraints.
Findings
Successfully generates seamless looping videos from text prompts.
Outperforms previous methods in visual quality and motion dynamics.
Flexible latent cycle length allows diverse looping effects.
Abstract
We present Mobius, a novel method to generate seamlessly looping videos directly from text descriptions without any user annotations, thereby creating new visual materials for multimedia presentations. Our method repurposes a pre-trained video latent diffusion model to generate looping videos from text prompts without any training. During inference, we first construct a latent cycle by connecting the starting and ending noise of the video. Given that temporal consistency can be maintained by the context of the video diffusion model, we perform multi-frame latent denoising by gradually shifting the first-frame latent to the end in each step. As a result, the denoising context varies in each step while consistency is maintained throughout the inference process. Moreover, the latent cycle in our method can be of any length. This extends our latent-shifting approach to generate…
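The latent-shift idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `denoise_window`, `latent_shift_loop`, the window stride, and the averaging of overlapping frames are all assumptions made for illustration, and the denoiser is a toy stand-in for one step of a real video diffusion model. The sketch treats the latents as a circular buffer, moves the first-frame latent to the end before every step, and denoises fixed-size windows that wrap around the cycle, so each frame sees a different context from step to step.

```python
import numpy as np

def denoise_window(window_latents, step):
    # Hypothetical stand-in for one denoising step of a video diffusion
    # model applied to a window of consecutive frame latents.
    return 0.9 * window_latents

def latent_shift_loop(cycle, num_steps, window):
    # cycle: array of shape (n_frames, ...) holding the noisy latent cycle.
    # window: number of frames the (assumed) model denoises at once.
    cycle = np.asarray(cycle, dtype=float)
    n = cycle.shape[0]
    assert 1 <= window <= n, "window must fit inside the latent cycle"
    for step in range(num_steps):
        # Shift the first-frame latent to the end of the cycle.
        cycle = np.roll(cycle, -1, axis=0)
        acc = np.zeros_like(cycle)
        count = np.zeros(n)
        for start in range(0, n, window):
            # Indices wrap around, which joins the end of the video to its start.
            idx = np.arange(start, start + window) % n
            acc[idx] += denoise_window(cycle[idx], step)
            count[idx] += 1
        # Average frames covered by more than one window (an assumption).
        cycle = acc / count.reshape((n,) + (1,) * (cycle.ndim - 1))
    # Undo the accumulated shift so frames return to their original order.
    return np.roll(cycle, num_steps, axis=0)
```

Because the windows wrap around the buffer, the cycle length `n` does not need to be a multiple of `window`, which is one way to read the abstract's remark that the latent cycle can be of any length.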
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Human Motion and Animation
Methods: Diffusion · Latent Diffusion Model
