A Scalable Attention-Based Approach for Image-to-3D Texture Mapping
By: Arianna Rampini, Kanika Madan, Bruno Roy, and more
Potential Business Impact:
Makes 3D objects look real from one picture.
High-quality textures are critical for realistic 3D content creation, yet existing generative methods are slow, rely on UV maps, and often fail to remain faithful to a reference image. To address these challenges, we propose a transformer-based framework that predicts a 3D texture field directly from a single image and a mesh, eliminating the need for UV mapping and differentiable rendering. Our method integrates a triplane representation with depth-based backprojection losses, enabling efficient training and fast inference. Once trained, it generates high-fidelity textures in a single forward pass, requiring only 0.2s per shape. Extensive qualitative, quantitative, and user preference evaluations demonstrate that our method outperforms state-of-the-art baselines on single-image texture reconstruction in both fidelity to the input image and perceptual quality, highlighting its practicality for scalable, high-quality, and controllable 3D content creation.
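The core idea of a triplane texture field, querying color at 3D surface points by sampling three axis-aligned feature planes, can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed names and sizes (TriplaneTextureField, feat_dim, res, the decoder layout), not the authors' implementation; in the paper the plane features would be predicted by the transformer from the input image and mesh rather than being free parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriplaneTextureField(nn.Module):
    """Sketch: predict RGB at 3D points by sampling three axis-aligned feature planes."""
    def __init__(self, feat_dim=32, res=128):
        super().__init__()
        # Three learned feature planes (XY, XZ, YZ). In the described method these
        # would come from a transformer conditioned on the image and mesh.
        self.planes = nn.Parameter(torch.randn(3, feat_dim, res, res) * 0.01)
        self.decoder = nn.Sequential(
            nn.Linear(3 * feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, pts):
        # pts: (N, 3) surface points normalized to [-1, 1]^3
        projections = [pts[:, [0, 1]], pts[:, [0, 2]], pts[:, [1, 2]]]  # XY, XZ, YZ
        feats = []
        for plane, uv in zip(self.planes, projections):
            # grid_sample expects input (B, C, H, W) and grid (B, H_out, W_out, 2)
            grid = uv.view(1, -1, 1, 2)
            f = F.grid_sample(plane.unsqueeze(0), grid, align_corners=True)  # (1, C, N, 1)
            feats.append(f.squeeze(0).squeeze(-1).t())  # (N, C)
        return self.decoder(torch.cat(feats, dim=-1))  # (N, 3) RGB

# Single forward pass: query colors at (hypothetical) sampled mesh surface points.
field = TriplaneTextureField()
points = torch.rand(1024, 3) * 2 - 1
colors = field(points)
print(colors.shape)  # torch.Size([1024, 3])
```

Because every surface point is colored by one decoder evaluation over sampled plane features, no UV parameterization or differentiable renderer is needed at inference, which is what makes the single-pass, sub-second generation claim plausible.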
Similar Papers
CaliTex: Geometry-Calibrated Attention for View-Coherent 3D Texture Generation
CV and Pattern Recognition
Makes 3D objects look real from every angle.
Enhancing Monocular 3D Hand Reconstruction with Learned Texture Priors
CV and Pattern Recognition
Makes computer-generated hands look more real and accurate.
TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
CV and Pattern Recognition
Makes 3D models look real and helps computers understand them.