DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
By: Junjia Huang, Pengxiang Yan, Jinhang Cai and more
Potential Business Impact:
Creates realistic pictures from text, layer by layer.
Text-driven image generation using diffusion models has recently gained significant attention. To enable more flexible image manipulation and editing, recent research has expanded from single image generation to transparent layer generation and multi-layer compositions. However, existing approaches often fail to provide a thorough exploration of multi-layer structures, leading to inconsistent inter-layer interactions, such as occlusion relationships, spatial layout, and shadowing. In this paper, we introduce DreamLayer, a novel framework that enables coherent text-driven generation of multiple image layers, by explicitly modeling the relationship between transparent foreground and background layers. DreamLayer incorporates three key components, i.e., Context-Aware Cross-Attention (CACA) for global-local information exchange, Layer-Shared Self-Attention (LSSA) for establishing robust inter-layer connections, and Information Retained Harmonization (IRH) for refining fusion details at the latent level. By leveraging a coherent full-image context, DreamLayer builds inter-layer connections through attention mechanisms and applies a harmonization step to achieve seamless layer fusion. To facilitate research in multi-layer generation, we construct a high-quality, diverse multi-layer dataset including 400k samples. Extensive experiments and user studies demonstrate that DreamLayer generates more coherent and well-aligned layers, with broad applicability, including latent-space image editing and image-to-layer decomposition.
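To make the idea of composing transparent foreground layers over a background concrete, here is a minimal sketch of standard alpha compositing (the Porter-Duff "over" operator) in plain Python. It is not DreamLayer's method: the abstract describes harmonization at the latent level, while this sketch works directly on pixels. The names `over` and `composite`, and the representation of a layer as nested lists of straight-alpha RGBA tuples in [0, 1], are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumption: one pixel is (r, g, b, a) with every channel in [0, 1]
# and straight (non-premultiplied) alpha.
RGBA = Tuple[float, float, float, float]
Layer = List[List[RGBA]]  # rows of pixels


def over(fg: RGBA, bg: RGBA) -> RGBA:
    """Porter-Duff 'over' operator for a single straight-alpha pixel."""
    f_r, f_g, f_b, f_a = fg
    b_r, b_g, b_b, b_a = bg
    out_a = f_a + b_a * (1.0 - f_a)
    if out_a == 0.0:
        # Both pixels are fully transparent.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(f: float, b: float) -> float:
        return (f * f_a + b * b_a * (1.0 - f_a)) / out_a

    return (blend(f_r, b_r), blend(f_g, b_g), blend(f_b, b_b), out_a)


def composite(layers: List[Layer]) -> Layer:
    """Composite same-sized layers listed back to front (background first)."""
    result = layers[0]
    for layer in layers[1:]:
        result = [
            [over(f, b) for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(layer, result)
        ]
    return result
```

For example, compositing a half-transparent red pixel over an opaque blue one with `composite([[[(0.0, 0.0, 1.0, 1.0)]], [[(1.0, 0.0, 0.0, 0.5)]]])` yields an opaque pixel that mixes the two colors equally. Naive compositing like this cannot add shadows or fix occlusion between independently generated layers, which is the kind of inter-layer inconsistency the abstract says DreamLayer targets.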
Similar Papers
PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment
CV and Pattern Recognition
Creates layered pictures with real-looking shadows.
DreamLight: Towards Harmonious and Consistent Image Relighting
CV and Pattern Recognition
Changes picture lighting to match any background.
DreamFuse: Adaptive Image Fusion with Diffusion Transformer
CV and Pattern Recognition
Makes pictures blend objects naturally into backgrounds.