FlexPainter: Flexible and Multi-View Consistent Texture Generation
By: Dongyu Yan, Leyi Wu, Jiantao Lin and more
Potential Business Impact:
Makes 3D models look real with better textures.
Texture map production is an important part of 3D modeling and determines rendering quality. Recently, diffusion-based methods have opened a new avenue for texture generation. However, restricted control flexibility and limited prompt modalities may prevent creators from producing the results they want. Furthermore, inconsistencies between generated multi-view images often lead to poor texture quality. To address these issues, we introduce FlexPainter, a novel texture generation pipeline that enables flexible multi-modal conditional guidance and achieves highly consistent texture generation. A shared conditional embedding space is constructed to perform flexible aggregation between different input modalities. Using this embedding space, we present an image-based classifier-free guidance (CFG) method that decomposes structural and style information, achieving reference image-based stylization. Leveraging the 3D knowledge within the image diffusion prior, we first generate multi-view images simultaneously using a grid representation to enhance global understanding. We also propose a view synchronization and adaptive weighting module, applied during diffusion sampling, to further ensure local consistency. Finally, a 3D-aware texture completion model combined with a texture enhancement model is used to generate seamless, high-resolution texture maps. Comprehensive experiments demonstrate that our framework significantly outperforms state-of-the-art methods in both flexibility and generation quality.
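The abstract names two mechanisms, CFG-style guidance and adaptive weighting across views, without giving their formulas. The sketch below is only a generic illustration of those two ideas and should not be read as FlexPainter's actual method. `cfg_combine` is the standard classifier-free guidance rule; `weighted_view_fusion` is a hypothetical per-texel weighted average, and all function names, array shapes, and the choice of weights are assumptions made for illustration.

```python
import numpy as np


def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance rule (generic, not paper-specific).

    Moves the noise prediction from the unconditional estimate toward the
    conditional one; scale=1 returns the conditional prediction unchanged.
    """
    return eps_uncond + scale * (eps_cond - eps_uncond)


def weighted_view_fusion(colors, weights, eps=1e-8):
    """Hypothetical fusion of per-view texel colors by normalized weights.

    colors:  (V, N, 3) array, the color each of V views assigns to N texels
    weights: (V, N) array, non-negative confidence of each view per texel
             (assumption: e.g. higher where the surface faces the camera)
    Returns an (N, 3) array of blended texel colors.
    """
    w = weights[..., None]                      # (V, N, 1), broadcasts over RGB
    return (w * colors).sum(axis=0) / (w.sum(axis=0) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Guidance on a toy 4x4 "noise prediction".
    eps_u = rng.normal(size=(4, 4))
    eps_c = rng.normal(size=(4, 4))
    guided = cfg_combine(eps_u, eps_c, scale=3.0)
    print("guided prediction shape:", guided.shape)

    # Two views, five texels: the first view is trusted twice as much.
    colors = rng.uniform(size=(2, 5, 3))
    weights = np.stack([np.full(5, 2.0), np.full(5, 1.0)])
    fused = weighted_view_fusion(colors, weights)
    print("fused texture shape:", fused.shape)
```

A normalized weighted average is used here because it degrades gracefully: a texel seen well by only one view takes that view's color, while texels seen by several views get a blend, which is one simple way to reduce seams between views.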
Similar Papers
MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control
CV and Pattern Recognition
Makes 3D objects look real with detailed colors.
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation
CV and Pattern Recognition
Makes 3D objects from words in seconds.
A Scalable Attention-Based Approach for Image-to-3D Texture Mapping
CV and Pattern Recognition
Makes 3D objects look real from one picture.