OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation
Weiqi Li, Shijie Zhao, Chong Mou, Xuhan Sheng, Zhenyu Zhang, Qian Wang, Junlin Li, Li Zhang, Jian Zhang

TL;DR
OmniDrag introduces a novel method for controllable omnidirectional video generation that enables precise scene- and object-level motion control, overcoming previous limitations in content accuracy and complex spherical motions.
Contribution
It is the first approach to jointly enable scene- and object-level motion control for high-quality omnidirectional image-to-video generation, utilizing a new spherical motion estimator and dataset.
Findings
Outperforms existing methods in control accuracy and video quality.
Effectively handles complex spherical motions with minimal spatial distortion.
Provides a user-friendly drag-style control interface.
Abstract
As virtual reality gains popularity, the demand for controllable creation of immersive and dynamic omnidirectional videos (ODVs) is increasing. While previous text-to-ODV generation methods achieve impressive results, they struggle with content inaccuracies and inconsistencies due to reliance solely on textual inputs. Although recent motion control techniques provide fine-grained control for video generation, directly applying these methods to ODVs often results in spatial distortion and unsatisfactory performance, especially with complex spherical motions. To tackle these challenges, we propose OmniDrag, the first approach enabling both scene- and object-level motion control for accurate, high-quality omnidirectional image-to-video generation. Building on pretrained video diffusion models, we introduce an omnidirectional control module, which is jointly fine-tuned with temporal…
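To make the "complex spherical motions" point concrete, here is a minimal sketch of why a drag that looks straight in a flat panorama is not straight on the sphere. It is not the paper's method. It assumes the frame is stored in the common equirectangular layout, which the abstract does not state, and all function names are hypothetical.

```python
import math

def pixel_to_lonlat(x, y, width, height):
    """Equirectangular pixel -> (longitude, latitude) in radians (assumed layout)."""
    lon = (x / width) * 2.0 * math.pi - math.pi   # left edge = -pi, right edge = +pi
    lat = math.pi / 2.0 - (y / height) * math.pi  # top row = +pi/2, bottom row = -pi/2
    return lon, lat

def lonlat_to_pixel(lon, lat, width, height):
    """Inverse of pixel_to_lonlat."""
    x = (lon + math.pi) / (2.0 * math.pi) * width
    y = (math.pi / 2.0 - lat) / math.pi * height
    return x, y

def lonlat_to_unit(lon, lat):
    """Point on the unit sphere for a given longitude/latitude."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def unit_to_lonlat(v):
    x, y, z = v
    return math.atan2(y, x), math.asin(max(-1.0, min(1.0, z)))

def great_circle_path(p0, p1, width, height, steps=4):
    """Sample a drag from pixel p0 to pixel p1 along the great circle (slerp).

    Antipodal endpoints are not handled: the great circle is not unique there.
    """
    a = lonlat_to_unit(*pixel_to_lonlat(p0[0], p0[1], width, height))
    b = lonlat_to_unit(*pixel_to_lonlat(p1[0], p1[1], width, height))
    dot = max(-1.0, min(1.0, sum(i * j for i, j in zip(a, b))))
    omega = math.acos(dot)  # angle between the two endpoints
    path = []
    for k in range(steps + 1):
        t = k / steps
        if omega < 1e-9:
            v = a  # endpoints coincide, nothing to interpolate
        else:
            wa = math.sin((1.0 - t) * omega) / math.sin(omega)
            wb = math.sin(t * omega) / math.sin(omega)
            v = tuple(wa * i + wb * j for i, j in zip(a, b))
        lon, lat = unit_to_lonlat(v)
        path.append(lonlat_to_pixel(lon, lat, width, height))
    return path

if __name__ == "__main__":
    # Two points on the same pixel row, half a panorama apart.
    for x, y in great_circle_path((512, 256), (1536, 256), 2048, 1024):
        print(f"x={x:8.2f}  y={y:8.2f}")
```

In this example both endpoints sit on the same pixel row, so a naive 2D drag would stay on that row. The shortest path on the sphere instead bends toward the top of the image and passes through the pole, which is the kind of mismatch that a trajectory-based control designed for flat video does not account for.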
Taxonomy
Topics: Advanced Vision and Imaging · Medical Image Segmentation Techniques · Digital Image Processing Techniques
Methods: Softmax · Attention Is All You Need · Diffusion
