MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control
Yining Yao, Xi Guo, Chenjing Ding, Wei Wu

TL;DR
MyGo is a framework for generating multi-view driving videos. It conditions a pre-trained video diffusion model on the motion of onboard cameras to improve camera controllability and spatial-temporal consistency, with the aim of supplying training data for autonomous driving models.
Contribution
MyGo injects camera motion conditions into a pre-trained video diffusion model through plug-in modules, and uses epipolar constraints together with neighbor-view information to improve multi-view consistency.
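To make the epipolar idea concrete, here is a minimal NumPy sketch of the underlying geometry. It is not MyGo's implementation: this summary does not say how the constraint is applied inside the model, so the function names (`fundamental_matrix`, `epipolar_distance`, `epipolar_mask`), the pixel threshold, and the idea of turning the constraint into a boolean mask between two views are all illustrative assumptions. The sketch assumes pinhole cameras with intrinsics `K1` and `K2` and a relative pose `(R, t)` such that a point `x1` in the first camera frame maps to `x2 = R @ x1 + t` in the second.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x, so skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_matrix(K1, K2, R, t):
    """Fundamental matrix F with p2^T F p1 = 0 for matching pixels p1, p2.

    Assumes x2 = R @ x1 + t maps camera-1 coordinates to camera-2 coordinates.
    """
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)


def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 (view 2) to the epipolar line of p1 (view 1)."""
    line = F @ np.append(p1, 1.0)  # line coefficients (a, b, c) in view 2
    return abs(line @ np.append(p2, 1.0)) / np.hypot(line[0], line[1])


def epipolar_mask(F, pixels1, pixels2, threshold=2.0):
    """Boolean (N1, N2) mask: True where a view-2 pixel lies within
    `threshold` pixels of the epipolar line of a view-1 pixel.

    The threshold value is a hypothetical choice for illustration.
    """
    h1 = np.hstack([pixels1, np.ones((len(pixels1), 1))])
    h2 = np.hstack([pixels2, np.ones((len(pixels2), 1))])
    lines = h1 @ F.T                      # one epipolar line per view-1 pixel
    num = np.abs(lines @ h2.T)            # |a*u + b*v + c| for every pair
    den = np.hypot(lines[:, 0], lines[:, 1])[:, None]
    return num / den <= threshold
```

A mask of this kind could, for example, restrict which positions in a neighboring view a given pixel is allowed to attend to; whether MyGo does this is not stated in the text above.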
Findings
Achieved state-of-the-art results in camera-controlled video generation.
Enhanced multi-view driving video generation quality.
Demonstrated improved spatial-temporal consistency in generated videos.
Abstract
High-quality driving video generation is crucial for providing training data for autonomous driving models. However, current generative models rarely focus on enhancing camera motion control under multi-view tasks, which is essential for driving video generation. Therefore, we propose MyGo, an end-to-end framework for video generation, introducing motion of onboard cameras as conditions to make progress in camera controllability and multi-view consistency. MyGo employs additional plug-in modules to inject camera parameters into the pre-trained video diffusion model, which retains the extensive knowledge of the pre-trained model as much as possible. Furthermore, we use epipolar constraints and neighbor view information during the generation process of each view to enhance spatial-temporal consistency. Experimental results show that MyGo has achieved state-of-the-art results in both…
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Optical Imaging Technologies
Methods: Diffusion · Focus
