Epipolar Geometry Improves Video Generation Models

Published: October 24, 2025 | arXiv ID: 2510.21615v1

By: Orest Kupyn, Fabian Manhardt, Federico Tombari and more

Potential Business Impact:

Makes videos look real by fixing shaky camera moves.

Business Areas:

Motion Capture, Media and Entertainment, Video

Video generation models have progressed tremendously through large latent diffusion transformers trained with rectified flow techniques. Yet these models still struggle with geometric inconsistencies, unstable motion, and visual artifacts that break the illusion of realistic 3D scenes. 3D-consistent video generation could significantly impact numerous downstream applications in generation and reconstruction tasks. We explore how epipolar geometry constraints improve modern video diffusion models. Despite massive training data, these models fail to capture fundamental geometric principles underlying visual content. We align diffusion models using pairwise epipolar geometry constraints via preference-based optimization, directly addressing unstable camera trajectories and geometric artifacts through mathematically principled geometric enforcement. Our approach efficiently enforces geometric principles without requiring end-to-end differentiability. Evaluation demonstrates that classical geometric constraints provide more stable optimization signals than modern learned metrics, which produce noisy targets that compromise alignment quality. Training on static scenes with dynamic cameras ensures high-quality measurements while the model generalizes effectively to diverse dynamic content. By bridging data-driven deep learning with classical geometric computer vision, we present a practical method for generating spatially consistent videos without compromising visual quality.
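To make the idea of a pairwise epipolar constraint used as a preference signal more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which epipolar metric is used, so the Sampson distance (a standard first-order approximation of the geometric epipolar error) is an assumption here, as are the hypothetical function names `sampson_error` and `make_preference_pair`. The fundamental matrix `F` and the matched points are assumed to be given, for example from a feature matcher followed by a robust estimator.

```python
import numpy as np


def sampson_error(F, pts1, pts2):
    """Mean Sampson epipolar error for matched points between two frames.

    F    : (3, 3) fundamental matrix mapping frame-1 points to frame-2 lines.
    pts1 : (N, 2) pixel coordinates in frame 1.
    pts2 : (N, 2) matching pixel coordinates in frame 2.
    """
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])  # homogeneous coordinates, shape (N, 3)
    x2 = np.hstack([pts2, ones])

    Fx1 = x1 @ F.T   # row i is F @ x1_i (epipolar line in frame 2)
    Ftx2 = x2 @ F    # row i is F.T @ x2_i (epipolar line in frame 1)

    # Squared algebraic residual x2^T F x1 for every correspondence.
    num = np.sum(x2 * Fx1, axis=1) ** 2
    # First-order normalisation by the gradient magnitude of the residual.
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return float(np.mean(num / np.maximum(den, 1e-12)))


def make_preference_pair(errors):
    """Pick (preferred, rejected) sample indices from per-video errors.

    Assumption: the video with the lowest epipolar error is treated as the
    preferred sample and the one with the highest error as the rejected one.
    """
    errors = np.asarray(errors, dtype=float)
    return int(np.argmin(errors)), int(np.argmax(errors))
```

In such a setup, several videos would be generated for the same prompt, each scored by its epipolar error across frame pairs, and the lowest- and highest-error samples passed to a preference-optimization step. Because the score is only used to rank samples, it does not need to be differentiable, which is consistent with the abstract's remark about not requiring end-to-end differentiability.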

GeoVideo: Introducing Geometric Regularization into Video Generation Model

CV and Pattern Recognition

Makes videos look real and move smoothly.

3 Dec 2025

GeoWorld: Unlocking the Potential of Geometry Models to Facilitate High-Fidelity 3D Scene Generation

CV and Pattern Recognition

Creates realistic 3D worlds from pictures.

28 Nov 2025

Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models

CV and Pattern Recognition

Makes 3D pictures look real from any angle.

11 Apr 2025

Page Count

16 pages