Learning High-Quality Initial Noise for Single-View Synthesis with Diffusion Models
By: Zhihao Zhang, Xuejun Yang, Weihua Liu, and more
Single-view novel view synthesis (NVS) models built on diffusion models have recently attracted increasing attention, as they can generate a series of novel-view images conditioned on a single input image and camera-pose information. It has been observed that in diffusion models, certain high-quality initial noise patterns lead to better generation results than others. However, there remains a lack of dedicated learning frameworks that enable NVS models to learn such high-quality noise. To obtain high-quality initial noise from random Gaussian noise, we make the following contributions. First, we design a discretized Euler inversion method that injects image semantic information into random noise, thereby constructing paired datasets of random and high-quality noise. Second, we propose a learning framework based on an encoder-decoder network (EDN) that directly transforms random noise into high-quality noise. Experiments demonstrate that the proposed EDN can be seamlessly plugged into various NVS models, such as SV3D and MV-Adapter, achieving significant performance improvements across multiple datasets. Code is available at: https://github.com/zhihao0512/EDN.
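To make the inversion idea concrete, the sketch below shows a generic discretized Euler inversion loop: a clean image is walked back toward Gaussian noise along an increasing noise schedule, so the resulting noise retains the image's semantic structure. This is a minimal illustration, not the paper's implementation; `dummy_eps` stands in for the diffusion model's noise prediction (which in the NVS setting would be conditioned on the input view and camera pose), and the schedule and step rule are assumptions.

```python
import numpy as np

def euler_inversion(x0, eps_fn, sigmas):
    """Toy discretized Euler inversion: integrate the probability-flow
    ODE in reverse, from a clean sample x0 toward higher noise levels,
    using plain Euler steps over an increasing sigma schedule."""
    x = x0.copy()
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = eps_fn(x, s_cur)          # model's estimated noise direction
        x = x + (s_next - s_cur) * d  # discretized Euler update
    return x

# Placeholder "denoiser": a real NVS diffusion model would predict the
# noise conditioned on the input image and camera pose.
def dummy_eps(x, sigma):
    return 0.1 * x  # hypothetical stand-in direction field

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))       # stand-in for a clean latent
sigmas = np.linspace(0.0, 1.0, 11)     # increasing noise schedule
xT = euler_inversion(x0, dummy_eps, sigmas)  # image-informed "noise"
```

In the paper's pipeline, pairs of such inverted (high-quality) noise and ordinary Gaussian noise would then supervise the encoder-decoder network that maps random noise directly to high-quality noise at sampling time.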