Diffusion-based G-buffer generation and rendering
By: Bowen Xue, Giuseppe Claudio Guarnera, Shuang Zhao, and more
Potential Business Impact:
Lets you edit computer-generated pictures after they are made, such as moving objects or changing the lighting.
Despite recent advances in text-to-image generation, controlling geometric layout and material properties in synthesized scenes remains challenging. We present a novel pipeline that first produces a G-buffer (albedo, normals, depth, roughness, and metallic) from a text prompt and then renders a final image through a modular neural network. This intermediate representation enables fine-grained editing: users can copy and paste within specific G-buffer channels to insert or reposition objects, or apply masks to the irradiance channel to adjust lighting locally. As a result, real objects can be seamlessly integrated into virtual scenes, and virtual objects can be placed into real environments with high fidelity. By separating scene decomposition from image rendering, our method offers a practical balance between detailed post-generation control and efficient text-driven synthesis. We demonstrate its effectiveness on a variety of examples, showing that G-buffer editing significantly extends the flexibility of text-guided image generation.
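The abstract describes a two-stage design: a diffusion model turns the text prompt into per-pixel G-buffer channels, edits are applied directly to those channels, and a modular neural renderer produces the final image. The sketch below illustrates that flow under stated assumptions; the function names (generate_gbuffer, copy_paste_channel, relight_region, neural_render) and the dict-of-arrays G-buffer layout are hypothetical placeholders for illustration, not the authors' code or API.

```python
import numpy as np

def generate_gbuffer(prompt: str, size=(512, 512)) -> dict:
    """Stage 1 (assumed interface): a diffusion model maps a text prompt
    to per-pixel G-buffer channels. Zero arrays stand in for real outputs."""
    h, w = size
    return {
        "albedo":     np.zeros((h, w, 3), dtype=np.float32),  # base color
        "normals":    np.zeros((h, w, 3), dtype=np.float32),  # surface orientation
        "depth":      np.zeros((h, w, 1), dtype=np.float32),  # scene depth
        "roughness":  np.zeros((h, w, 1), dtype=np.float32),  # microfacet roughness
        "metallic":   np.zeros((h, w, 1), dtype=np.float32),  # metalness
        "irradiance": np.zeros((h, w, 3), dtype=np.float32),  # incoming light
    }

def copy_paste_channel(gbuffer, channel, src_box, dst_corner):
    """Fine-grained edit: copy a region within one G-buffer channel
    (e.g. to insert or reposition an object) without touching other channels."""
    y0, x0, y1, x1 = src_box
    ty, tx = dst_corner
    patch = gbuffer[channel][y0:y1, x0:x1].copy()
    gbuffer[channel][ty:ty + (y1 - y0), tx:tx + (x1 - x0)] = patch
    return gbuffer

def relight_region(gbuffer, mask, gain):
    """Local lighting edit: scale the irradiance channel under a boolean mask."""
    irr = gbuffer["irradiance"]
    gbuffer["irradiance"] = np.where(mask[..., None], irr * gain, irr)
    return gbuffer

def neural_render(gbuffer) -> np.ndarray:
    """Stage 2 (assumed interface): a modular neural renderer turns the
    (possibly edited) G-buffer into the final RGB image. A simple
    albedo-times-irradiance composite stands in for the learned renderer."""
    return np.clip(gbuffer["albedo"] * gbuffer["irradiance"], 0.0, 1.0)

# Usage: generate, edit one channel, relight a masked region, then render.
gb = generate_gbuffer("a copper kettle on a wooden table")
gb = copy_paste_channel(gb, "albedo", src_box=(100, 100, 200, 200), dst_corner=(300, 300))
gb = relight_region(gb, mask=np.zeros((512, 512), dtype=bool), gain=1.5)
image = neural_render(gb)
```

The separation mirrors the paper's stated benefit: because editing happens on the intermediate G-buffer rather than on the final pixels, a channel-level copy-paste or a masked irradiance change is re-rendered consistently by the second stage.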
Similar Papers
FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering
CV and Pattern Recognition
Makes game graphics look real and smooth.
DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models
CV and Pattern Recognition
Uses video diffusion models to make computer-rendered pictures look real.
Real-time Global Illumination for Dynamic 3D Gaussian Scenes
Graphics
Makes 3D worlds look real, even when moving.