Category-Aware 3D Object Composition with Disentangled Texture and Shape Multi-view Diffusion
By: Zeren Xiong , Zikun Chen , Zedong Zhang and more
Potential Business Impact:
Creates new 3D objects by mixing two ideas.
In this paper, we tackle a new task of 3D object synthesis, where a 3D model is composited with another object category to create a novel 3D model. However, most existing text/image/3D-to-3D methods struggle to effectively integrate multiple content sources, often resulting in inconsistent textures and inaccurate shapes. To overcome these challenges, we propose a straightforward yet powerful approach, category+3D-to-3D (C33D), for generating novel and structurally coherent 3D models. Our method begins by rendering multi-view images and normal maps from the input 3D model, then generating a novel 2D object using adaptive text-image harmony (ATIH) with the front-view image and a text description from another object category as inputs. To ensure texture consistency, we introduce texture multi-view diffusion, which refines the textures of the remaining multi-view RGB images based on the novel 2D object. For enhanced shape accuracy, we propose shape multi-view diffusion to improve the 2D shapes of both the multi-view RGB images and the normal maps, also conditioned on the novel 2D object. Finally, these outputs are used to reconstruct a complete and novel 3D model. Extensive experiments demonstrate the effectiveness of our method, yielding impressive 3D creations, such as shark(3D)-crocodile(text) in the first row of Fig. 1. A project page is available at: https://xzr52.github.io/C33D/
Similar Papers
Make Your MoVe: Make Your 3D Contents by Adapting Multi-View Diffusion Models to External Editing
CV and Pattern Recognition
Edits 3D objects without messing up their shape.
KineDiff3D: Kinematic-Aware Diffusion for Category-Level Articulated Object Shape Reconstruction and Generation
CV and Pattern Recognition
Builds 3D models of moving objects from one picture.
3D-Consistent Multi-View Editing by Diffusion Guidance
CV and Pattern Recognition
Makes 3D pictures look right after editing.