FROMAT: Multiview Material Appearance Transfer via Few-Shot Self-Attention Adaptation
By: Hubert Kompanowski , Varun Jampani , Aaryaman Vasishta and more
Multiview diffusion models have rapidly emerged as a powerful tool for content creation with spatial consistency across viewpoints, offering rich visual realism without requiring explicit geometry and appearance representation. However, compared to meshes or radiance fields, existing multiview diffusion models offer limited appearance manipulation, particularly in terms of material, texture, or style. In this paper, we present a lightweight adaptation technique for appearance transfer in multiview diffusion models. Our method learns to combine object identity from an input image with appearance cues rendered in a separate reference image, producing multi-view-consistent output that reflects the desired materials, textures, or styles. This allows explicit specification of appearance parameters at generation time while preserving the underlying object geometry and view coherence. We leverage three diffusion denoising processes responsible for generating the original object, the reference, and the target images, and perform reverse sampling to aggregate a small subset of layer-wise self-attention features from the object and the reference to influence the target generation. Our method requires only a few training examples to introduce appearance awareness to pretrained multiview models. The experiments show that our method provides a simple yet effective way toward multiview generation with diverse appearance, advocating the adoption of implicit generative 3D representations in practice.
Similar Papers
Make Your MoVe: Make Your 3D Contents by Adapting Multi-View Diffusion Models to External Editing
CV and Pattern Recognition
Edits 3D objects without messing up their shape.
Leveraging Diffusion Models for Stylization using Multiple Style Images
CV and Pattern Recognition
Changes pictures to look like any art style.
3D-Consistent Multi-View Editing by Diffusion Guidance
CV and Pattern Recognition
Makes 3D pictures look right after editing.