ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation
By: Jiahao Chang, Chongjie Ye, Yushuang Wu and more
Potential Business Impact:
Makes 3D models from only a few pictures.
Existing multi-view 3D object reconstruction methods rely heavily on sufficient overlap between input views; in practice, occlusions and sparse coverage frequently leave reconstructions severely incomplete. Recent advances in diffusion-based 3D generative techniques could address these limitations by leveraging learned generative priors to hallucinate the invisible parts of objects, thereby generating plausible 3D structures. However, the stochastic nature of the inference process limits the accuracy and reliability of the generated results, which prevents existing reconstruction frameworks from integrating such 3D generative priors. In this work, we comprehensively analyze why diffusion-based 3D generative methods fail to achieve high consistency, identifying (a) insufficient construction and use of cross-view connections when extracting multi-view image features as conditions, and (b) poor controllability of iterative denoising during local detail generation, which easily produces fine geometric and texture details that are plausible but inconsistent with the inputs. Accordingly, we propose ReconViaGen, which integrates reconstruction priors into the generative framework, and devise several strategies that effectively address these issues. Extensive experiments demonstrate that ReconViaGen can reconstruct complete and accurate 3D models consistent with the input views in both global structure and local details. Project page: https://jiahao620.github.io/reconviagen.
Similar Papers
Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
CV and Pattern Recognition
Helps robots see and grab objects better.
SparseRecon: Neural Implicit Surface Reconstruction from Sparse Views with Feature and Depth Consistencies
CV and Pattern Recognition
Builds 3D shapes from just a few pictures.
Improving Multi-View Reconstruction via Texture-Guided Gaussian-Mesh Joint Optimization
CV and Pattern Recognition
Creates realistic 3D models from pictures.