TL;DR
SymphoMotion is a unified framework for controlling camera motion and object dynamics simultaneously, improving coherence and realism in video generation.
Contribution
The paper introduces a single model that jointly controls camera trajectories and object dynamics using geometry-aware cues, along with a new dataset, RealCOD-25K. Together these address a limitation of existing methods, which handle only one motion type.
Findings
Outperforms existing methods in visual fidelity and controllability.
Enables depth-aware, spatially coherent object manipulation.
Provides RealCOD-25K, a new real-world dataset for unified motion control in videos.
Abstract
Controlling both camera motion and object dynamics is essential for coherent and expressive video generation, yet current methods typically handle only one motion type or rely on ambiguous 2D cues that entangle camera-induced parallax with true object movement. We present SymphoMotion, a unified motion-control framework that jointly governs camera trajectories and object dynamics within a single model. SymphoMotion features a Camera Trajectory Control mechanism that integrates explicit camera paths with geometry-aware cues to ensure stable, structurally consistent viewpoint transitions, and an Object Dynamics Control mechanism that combines 2D visual guidance with 3D trajectory embeddings to enable depth-aware, spatially coherent object manipulation. To support large-scale training and evaluation, we further construct RealCOD-25K, a comprehensive real-world dataset containing paired…
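The ambiguity of 2D cues mentioned in the abstract can be illustrated with a toy pinhole-camera calculation. The sketch below is not the SymphoMotion method; it is a minimal example, assuming an idealized camera that looks down the +z axis and translates only along x, and all function names (`project`, `image_shift`) are hypothetical. It shows that a camera move and an object move can produce the same 2D displacement for one point, while a second point at a different depth tells the two cases apart.

```python
def project(point, cam_x, focal=1.0):
    """Pinhole projection of a 3D world point (x, y, z).

    Assumption: the camera looks down +z and is translated by cam_x along x.
    """
    x, y, z = point
    return (focal * (x - cam_x) / z, focal * y / z)


def image_shift(p_start, p_end, cam_start, cam_end):
    """2D displacement of a point between two frames."""
    u0, v0 = project(p_start, cam_start)
    u1, v1 = project(p_end, cam_end)
    return (u1 - u0, v1 - v0)


near = (0.0, 0.0, 2.0)  # point on the object of interest
far = (0.0, 0.0, 4.0)   # static background point at greater depth

# Case A: the object is static and the camera moves +0.5 along x.
shift_camera_only = image_shift(near, near, 0.0, 0.5)

# Case B: the camera is static and the object moves -0.5 along x.
shift_object_only = image_shift(near, (-0.5, 0.0, 2.0), 0.0, 0.0)

# The 2D displacement of the near point is the same in both cases.
same_2d_cue = shift_camera_only == shift_object_only

# The background point separates the cases: it shifts under camera
# motion (by a depth-dependent amount) but stays fixed under object motion.
far_shift_camera = image_shift(far, far, 0.0, 0.5)
far_shift_object = image_shift(far, far, 0.0, 0.0)

print(same_2d_cue, far_shift_camera, far_shift_object)
```

In this toy setup, a 2D track of the near point alone cannot say which of the two motions occurred, which is one way to read the abstract's motivation for combining explicit camera paths with depth-aware, 3D trajectory information.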
