GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces
By: Melis Ocal, Xiaoyan Xing, Yue Li and more
Potential Business Impact:
Changes 3D objects with text, instantly.
3D stylization is central to game development, virtual reality, and digital arts, where the demand for diverse assets calls for scalable methods that support fast, high-fidelity manipulation. Existing text-to-3D stylization methods typically distill from 2D image editors. They require time-intensive per-asset optimization and exhibit multi-view inconsistency due to the limitations of current text-to-image models, which makes them impractical for large-scale production. In this paper, we introduce GaussianBlender, a pioneering feed-forward framework for text-driven 3D stylization that performs edits instantly at inference. Our method learns structured, disentangled latent spaces for geometry and appearance, with controlled information sharing between them, from spatially grouped 3D Gaussians. A latent diffusion model then applies text-conditioned edits on these learned representations. Comprehensive evaluations show that GaussianBlender not only delivers instant, high-fidelity, geometry-preserving, multi-view consistent stylization, but also surpasses methods that require per-instance test-time optimization, unlocking practical, democratized 3D stylization at scale.
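To make the phrases "spatially grouped 3D Gaussians" and "disentangled ... geometry and appearance" more concrete, here is a minimal NumPy sketch of the data layout they suggest. It is an illustration under stated assumptions, not the paper's architecture: the regular-grid grouping, the attribute names, the choice to count opacity as geometry, the mean pooling, and every function name are hypothetical. The learned encoders and the latent diffusion model are not shown.

```python
import numpy as np


def group_gaussians(positions, grid_size=4):
    """Assign each Gaussian to a cell of a regular grid (assumed grouping rule)."""
    lo = positions.min(axis=0)
    hi = positions.max(axis=0)
    extent = np.maximum(hi - lo, 1e-8)  # avoid division by zero on flat axes
    # Normalize positions to [0, 1], then bucket into grid_size cells per axis.
    cells = np.floor((positions - lo) / extent * grid_size).astype(int)
    cells = np.clip(cells, 0, grid_size - 1)
    # Flatten the 3D cell index into a single group id.
    return cells[:, 0] * grid_size**2 + cells[:, 1] * grid_size + cells[:, 2]


def split_attributes(gaussians):
    """Split per-Gaussian attributes into geometry and appearance channels.

    Placing opacity on the geometry side is an assumption of this sketch.
    """
    geometry = np.concatenate(
        [gaussians["position"], gaussians["scale"],
         gaussians["rotation"], gaussians["opacity"]],
        axis=1,
    )
    appearance = gaussians["color"]
    return geometry, appearance


def pool_by_group(features, group_ids, num_groups):
    """Average features within each spatial group; empty groups stay zero."""
    sums = np.zeros((num_groups, features.shape[1]))
    counts = np.zeros(num_groups)
    np.add.at(sums, group_ids, features)
    np.add.at(counts, group_ids, 1)
    return sums / np.maximum(counts, 1)[:, None]
```

In a setup like this, a text-conditioned edit that should preserve shape would modify only the pooled appearance features and leave the geometry features untouched, which is the intuition behind keeping the two representations separate.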
Similar Papers
3D-LATTE: Latent Space 3D Editing from Textual Instructions
Graphics
Changes 3D shapes with text instructions.
3D-LATTE: Latent Space 3D Editing from Textual Instructions
Graphics
Changes 3D objects with words, not just pictures.
SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer
CV and Pattern Recognition
Makes 3D objects look like any picture.