Score: 0

SatDreamer360: Geometry Consistent Street-View Video Generation from Satellite Imagery

Published: May 31, 2025 | arXiv ID: 2506.00600v1

By: Xianghui Ze , Beiyi Zhu , Zhenbo Song and more

Potential Business Impact:

Makes satellite pictures look like real street videos.

Business Areas:
Geospatial Data and Analytics, Navigation and Mapping

Generating continuous ground-level video from satellite imagery is a challenging task with significant potential for applications in simulation, autonomous navigation, and digital twin cities. Existing approaches primarily focus on synthesizing individual ground-view images, often relying on auxiliary inputs like height maps or handcrafted projections, and fall short in producing temporally consistent sequences. In this paper, we propose {SatDreamer360}, a novel framework that generates geometrically and temporally consistent ground-view video from a single satellite image and a predefined trajectory. To bridge the large viewpoint gap, we introduce a compact tri-plane representation that encodes scene geometry directly from the satellite image. A ray-based pixel attention mechanism retrieves view-dependent features from the tri-plane, enabling accurate cross-view correspondence without requiring additional geometric priors. To ensure multi-frame consistency, we propose an epipolar-constrained temporal attention module that aligns features across frames using the known relative poses along the trajectory. To support evaluation, we introduce {VIGOR++}, a large-scale dataset for cross-view video generation, with dense trajectory annotations and high-quality ground-view sequences. Extensive experiments demonstrate that SatDreamer360 achieves superior performance in fidelity, coherence, and geometric alignment across diverse urban scenes.

Country of Origin
🇨🇳 China

Page Count
13 pages

Category
Computer Science:
CV and Pattern Recognition