GimbalDiffusion: Gravity-Aware Camera Control for Video Generation
Frédéric Fortier-Chouinard, Yannick Hold-Geoffroy, Valentin Deschaintre, Matheus Gadelha, Jean-François Lalonde

TL;DR
GimbalDiffusion introduces a gravity-based, absolute coordinate system for precise, interpretable camera control in text-to-video generation, enabling extreme viewpoints and improved guidance.
Contribution
The paper presents GimbalDiffusion, a framework that uses gravity as a global reference to define camera trajectories in an absolute coordinate system, supporting large rotations and extreme angles with accurate, interpretable control over camera parameters.
Findings
Supports the full sphere of viewpoints, including extreme pitch and roll.
Uses null-pitch conditioning to prevent conflicting prompt content.
Proposes new benchmarks for gravity-aware camera control evaluation.
Abstract
Recent progress in text-to-video generation has achieved remarkable realism, yet fine-grained control over camera motion and orientation remains elusive, especially with extreme trajectories (e.g., a 180-degree turnaround, or looking directly up or down). Existing approaches typically encode camera trajectories using relative or ambiguous representations, limiting precise geometric control and offering limited support for large rotations. We introduce GimbalDiffusion, a framework that enables camera control grounded in physical-world coordinates, using gravity as a global reference. Instead of describing motion relative to previous frames, our method defines camera trajectories in an absolute coordinate system, allowing accurate, interpretable control over camera parameters. Using panoramic 360-degree videos for training, we cover the full sphere of possible viewpoints, including…
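The distinction the abstract draws between relative and absolute camera descriptions can be illustrated with a small sketch. It is not the paper's parameterization: the axis conventions (world z-axis pointing up, against gravity; a level camera looking along +x), the yaw/pitch/roll composition order, and all function names below are assumptions chosen for illustration only.

```python
import math

def _rot(axis, deg):
    """3x3 rotation matrix about a coordinate axis (right-handed, degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(f"unknown axis: {axis}")

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def absolute_rotation(yaw_deg, pitch_deg, roll_deg):
    """Camera-to-world rotation in a gravity-aligned frame (assumed convention).

    Positive pitch looks up, away from gravity; roll spins about the view axis.
    """
    return matmul(_rot("z", yaw_deg),
                  matmul(_rot("y", -pitch_deg), _rot("x", roll_deg)))

def forward_vector(R):
    """World-space viewing direction: the camera's +x axis."""
    return [R[0][0], R[1][0], R[2][0]]

def pitch_of(R):
    """Recover the pitch above the horizon (degrees) from an absolute rotation."""
    return math.degrees(math.asin(max(-1.0, min(1.0, R[2][0]))))

def relative_rotation(R_prev, R_next):
    """Rotation of the next frame expressed in the previous frame's axes."""
    return matmul(transpose(R_prev), R_next)

if __name__ == "__main__":
    # A tilt-up trajectory given as absolute pitches relative to the horizon.
    frames = [absolute_rotation(0, p, 0) for p in (0, 45, 90)]
    for R in frames:
        print([round(v, 3) for v in forward_vector(R)], round(pitch_of(R), 1))
    # The frame-to-frame steps are identical, so a purely relative encoding
    # cannot tell whether the camera ends level or pointing straight up.
    step_a = relative_rotation(frames[0], frames[1])
    step_b = relative_rotation(frames[1], frames[2])
    print(all(abs(step_a[i][j] - step_b[i][j]) < 1e-9
              for i in range(3) for j in range(3)))
```

In this toy setup, the two 45-degree steps are the same relative rotation, while the absolute pitch values distinguish a level view from one looking straight up. That is the kind of ambiguity a gravity-referenced description avoids.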
