SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis
By: Xiaohao Sun, Divyam Goel, Angel X. Chang
Potential Business Impact:
Creates realistic 3D rooms with furniture.
We present SemLayoutDiff, a unified model for synthesizing diverse 3D indoor scenes across multiple room types. The model introduces a scene layout representation that combines a top-down semantic map with per-object attributes. Unlike prior approaches, which cannot condition on architectural constraints, SemLayoutDiff employs a categorical diffusion model that conditions scene synthesis explicitly on room masks. It first generates a coherent semantic map, then uses a cross-attention-based network to predict furniture placements that respect the synthesized layout. Our method also accounts for architectural elements such as doors and windows, ensuring that generated furniture arrangements remain practical and unobstructed. Experiments on the 3D-FRONT dataset show that SemLayoutDiff produces spatially coherent, realistic, and varied scenes, outperforming previous methods.
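The abstract's first stage, a categorical diffusion over a top-down semantic map conditioned on a room mask, can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: it uses a uniform-transition corruption kernel (one common choice for discrete diffusion; the paper's exact kernel is not specified here), and the class count, grid size, and `exterior_class` label are hypothetical.

```python
import numpy as np

NUM_CLASSES = 8   # hypothetical semantic classes (floor, bed, table, ...)
GRID = 16         # hypothetical top-down map resolution

def q_sample(x0, beta, rng):
    """Forward categorical corruption: each cell is independently
    resampled uniformly over all classes with probability beta
    (uniform-transition discrete diffusion, a common choice)."""
    resample = rng.random(x0.shape) < beta
    noise = rng.integers(0, NUM_CLASSES, size=x0.shape)
    return np.where(resample, noise, x0)

def apply_room_mask(x, mask, exterior_class=0):
    """Explicit architectural conditioning: cells outside the room
    mask are clamped to the exterior class at every diffusion step,
    so generation only happens inside the given room footprint."""
    return np.where(mask, x, exterior_class)

# Usage: corrupt a layout, then enforce the room-mask constraint.
rng = np.random.default_rng(0)
x0 = rng.integers(0, NUM_CLASSES, size=(GRID, GRID))   # clean semantic map
mask = np.zeros((GRID, GRID), dtype=bool)
mask[4:12, 4:12] = True                                # room footprint
xt = apply_room_mask(q_sample(x0, beta=0.5, rng=rng), mask)
```

A learned denoiser would then iterate this in reverse, predicting class logits per cell while the mask clamp keeps the architecture fixed; the second stage (cross-attention furniture-attribute prediction) would consume the final semantic map.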
Similar Papers
DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis
Robotics
Builds realistic virtual rooms that work anywhere.
CymbaDiff: Structured Spatial Diffusion for Sketch-based 3D Semantic Urban Scene Generation
CV and Pattern Recognition
Creates 3D worlds from simple drawings.