Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting
By: Feng Yang, Wenliang Qian, Wangmeng Zuo and more
Potential Business Impact:
Makes 3D pictures from words more real.
Score Distillation Sampling (SDS) leverages pretrained 2D diffusion models to advance text-to-3D generation, but it neglects multi-view correlations, which makes it prone to geometric inconsistencies and multi-face artifacts in the generated 3D content. In this work, we propose Coupled Score Distillation (CSD), a framework that couples multi-view joint distribution priors to ensure geometrically consistent 3D generation while enabling stable, direct optimization of 3D Gaussian Splatting (3D-GS). Specifically, by reformulating the optimization as a multi-view joint optimization problem, we derive an optimization rule that couples multi-view priors to guide optimization across different viewpoints while preserving the diversity of generated 3D assets. Additionally, we propose a framework that directly optimizes 3D-GS from random initialization to generate geometrically consistent 3D content. We further employ a deformable tetrahedral grid, initialized from 3D-GS and refined through CSD, to produce high-quality meshes. Quantitative and qualitative experimental results demonstrate the efficiency and competitive quality of our approach.
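For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of the standard single-view SDS update, not the CSD rule proposed in the paper. It assumes the rendered image itself is the optimized parameter (so the renderer Jacobian is the identity), and it uses a hypothetical toy denoiser, `make_toy_denoiser`, in place of a pretrained text-conditioned diffusion model; all function and variable names here are illustrative.

```python
import numpy as np

def sds_gradient(rendered, predict_noise, alpha_bar, weight, rng):
    """Standard SDS gradient w.r.t. one rendered view (illustrative, not CSD)."""
    eps = rng.standard_normal(rendered.shape)
    # Forward-diffuse the rendering to the chosen noise level.
    noisy = np.sqrt(alpha_bar) * rendered + np.sqrt(1.0 - alpha_bar) * eps
    eps_pred = predict_noise(noisy, alpha_bar)
    # SDS omits the denoiser Jacobian and uses the noise residual directly.
    return weight * (eps_pred - eps)

def make_toy_denoiser(target):
    """Hypothetical stand-in for a pretrained model: assumes the clean image is `target`."""
    def predict_noise(noisy, alpha_bar):
        return (noisy - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)
    return predict_noise

rng = np.random.default_rng(0)
target = np.full((4, 4), 0.5)   # the image the toy "prior" prefers
image = np.zeros((4, 4))        # stands in for a rendering of the 3D scene
denoiser = make_toy_denoiser(target)

for _ in range(200):
    grad = sds_gradient(image, denoiser, alpha_bar=0.5, weight=1.0, rng=rng)
    image -= 0.05 * grad        # plain gradient descent on the rendering
```

Each call scores a single view in isolation, with no term linking it to renderings from other viewpoints. This is the missing multi-view correlation that the abstract identifies as the source of geometric inconsistency.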
Similar Papers
Rethinking Score Distilling Sampling for 3D Editing and Generation
CV and Pattern Recognition
Makes 3D models from text, and changes them.
Text-to-3D Generation using Jensen-Shannon Score Distillation
CV and Pattern Recognition
Creates better 3D pictures from words.
Consistent Flow Distillation for Text-to-3D Generation
CV and Pattern Recognition
Makes 3D pictures from words better.