
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Published: March 26, 2025 | arXiv ID: 2503.20785v1

By: Tianqi Liu, Zihao Huang, Zhaoxi Chen, and more

Potential Business Impact:

Turns a single picture into a moving 3D scene.

Business Areas:

3D Technology, Hardware, Software

We present Free4D, a novel tuning-free framework for 4D scene generation from a single image. Existing methods either focus on object-level generation, which makes scene-level generation infeasible, or rely on large-scale multi-view video datasets for expensive training and generalize poorly because 4D scene data is scarce. In contrast, our key insight is to distill pre-trained foundation models for a consistent 4D scene representation, which offers advantages in efficiency and generalizability. 1) We first animate the input image using image-to-video diffusion models, followed by 4D geometric structure initialization. 2) To turn this coarse structure into spatially and temporally consistent multi-view videos, we design an adaptive guidance mechanism with a point-guided denoising strategy for spatial consistency and a novel latent replacement strategy for temporal coherence. 3) To lift these generated observations into a consistent 4D representation, we propose a modulation-based refinement that mitigates inconsistencies while fully leveraging the generated information. The resulting 4D representation enables real-time, controllable rendering, marking a significant advancement in single-image-based 4D scene generation.
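The abstract does not say how the geometric structure is built or how points guide the denoising, so the following is only a generic illustration of the underlying idea, not the authors' method. It assumes a per-frame depth map and a pinhole camera: each pixel is lifted to a 3D point, and the points are then projected into a camera shifted sideways. The function names `unproject` and `project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `shift_x` parameter are all hypothetical and chosen for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Pixel = Tuple[float, float]


def unproject(depth: List[List[float]], fx: float, fy: float,
              cx: float, cy: float) -> List[Point]:
    """Lift every pixel (u, v) with depth z to a camera-space 3D point.

    Assumes a pinhole camera; this is an illustrative stand-in, not the
    geometric initialization described in the paper.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def project(points: List[Point], fx: float, fy: float,
            cx: float, cy: float, shift_x: float = 0.0) -> List[Pixel]:
    """Project points into a camera translated by shift_x along the x axis."""
    pixels: List[Pixel] = []
    for x, y, z in points:
        if z <= 0:
            continue  # skip points at or behind the camera plane
        pixels.append((fx * (x - shift_x) / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical 2x2 depth map: top row is near (2 units), bottom row is far (4 units).
depth_frame = [[2.0, 2.0], [4.0, 4.0]]
cloud = unproject(depth_frame, fx=100.0, fy=100.0, cx=0.5, cy=0.5)

# Same camera: the points land back on their source pixels.
same_view = project(cloud, fx=100.0, fy=100.0, cx=0.5, cy=0.5)

# Shifted camera: near points move further in the image than far points.
new_view = project(cloud, fx=100.0, fy=100.0, cx=0.5, cy=0.5, shift_x=0.02)
```

Running `unproject` on every frame of a video would give one point cloud per time step, which is one simple way to picture a coarse 4D structure. Projections like `new_view` are sparse and leave holes in the target view, which suggests why a generative model is still needed to produce complete multi-view videos.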

PSF-4D: A Progressive Sampling Framework for View Consistent 4D Editing

CV and Pattern Recognition

Changes videos to look like drawings.

14 Mar 2025

89%

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

CV and Pattern Recognition

Makes 3D objects move realistically from videos.

20 Mar 2025

89%

Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

CV and Pattern Recognition

Makes videos show 3D worlds without flickering.

3 Dec 2025


Page Count

14 pages

