Epipolar Geometry Improves Video Generation Models

Orest Kupyn; Fabian Manhardt; Federico Tombari; Christian Rupprecht

arXiv:2510.21615 · cs.CV · October 27, 2025

Open Access

TL;DR

This paper introduces a method that incorporates epipolar geometry constraints into video diffusion models, significantly improving their geometric consistency and stability without sacrificing visual quality.

Contribution

It presents a novel approach that aligns diffusion models with classical geometric principles using preference-based optimization, enhancing 3D consistency in video generation.
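As a rough illustration of what preference-based alignment with a geometric score could look like, the sketch below ranks candidate videos by a scalar geometric error and evaluates a generic DPO-style preference loss on the best and worst candidate. The function names, the choice of lowest-versus-highest error as the pair, and the `beta` value are assumptions made for illustration; the summary above does not specify the exact objective the authors use.

```python
import math


def pick_preference_pair(errors):
    """Return (winner_idx, loser_idx) for a list of per-video geometric errors.

    Assumption: lower error means better geometric consistency, so the
    lowest-error sample is preferred and the highest-error one is rejected.
    """
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    return order[0], order[-1]


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO-style loss: -log(sigmoid(margin)).

    logp_*     : log-likelihood (or a surrogate) under the model being trained
    ref_logp_* : the same quantity under a frozen reference model
    beta       : hypothetical temperature; not taken from the paper
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

Because the score is only used to rank samples, it never has to be differentiated, which is one reason a preference-based formulation pairs naturally with classical geometric measurements.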

Findings

01. Classical geometric constraints provide a more stable signal than learned metrics.

02. The method generalizes well from static scenes to dynamic content.

03. Geometric enforcement improves visual and spatial consistency.

Abstract

Video generation models have progressed tremendously through large latent diffusion transformers trained with rectified flow techniques. Yet these models still struggle with geometric inconsistencies, unstable motion, and visual artifacts that break the illusion of realistic 3D scenes. 3D-consistent video generation could significantly impact numerous downstream applications in generation and reconstruction tasks. We explore how epipolar geometry constraints improve modern video diffusion models. Despite massive training data, these models fail to capture fundamental geometric principles underlying visual content. We align diffusion models using pairwise epipolar geometry constraints via preference-based optimization, directly addressing unstable camera trajectories and geometric artifacts through mathematically principled geometric enforcement. Our approach efficiently enforces…
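To make the notion of a pairwise epipolar constraint concrete, here is a minimal sketch of the Sampson distance, a standard first-order approximation of how far a point correspondence is from satisfying x2ᵀ F x1 = 0. It assumes that matched pixel coordinates between two frames and a fundamental matrix `F` are already available (for example from a feature matcher and a robust estimator); the excerpt above does not say which error measure the paper uses, so this is an illustrative choice rather than the authors' metric.

```python
import numpy as np


def skew(t):
    """3x3 cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def sampson_error(F, x1, x2):
    """Per-correspondence Sampson distance.

    F  : (3, 3) fundamental matrix with convention x2^T F x1 = 0
    x1 : (N, 2) points in the first frame
    x2 : (N, 2) matching points in the second frame
    """
    ones = np.ones((x1.shape[0], 1))
    p1 = np.hstack([x1, ones])          # homogeneous coordinates
    p2 = np.hstack([x2, ones])
    Fp1 = p1 @ F.T                      # rows are F @ p1 (epipolar lines in frame 2)
    Ftp2 = p2 @ F                       # rows are F^T @ p2 (epipolar lines in frame 1)
    num = np.sum(p2 * Fp1, axis=1) ** 2
    den = Fp1[:, 0] ** 2 + Fp1[:, 1] ** 2 + Ftp2[:, 0] ** 2 + Ftp2[:, 1] ** 2
    return num / den


def mean_epipolar_error(F, x1, x2):
    """Scalar score for a frame pair; lower means more consistent geometry."""
    return float(np.mean(sampson_error(F, x1, x2)))
```

Averaging this score over sampled frame pairs gives one number per generated video, which can then be used to rank samples by geometric consistency.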

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Advanced Vision and Imaging