PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment
By: Dingbang Huang, Wenbo Li, Yifei Zhao, and more
Potential Business Impact:
Creates layered pictures with real-looking shadows.
Diffusion models have made remarkable advances in generating high-quality images from textual descriptions. Recent works such as LayerDiffuse have extended the single-layer, unified image generation paradigm to transparent image layer generation. However, existing multi-layer generation methods fail to handle the interactions among multiple layers, such as a rational global layout, physically plausible contacts, and visual effects like shadows and reflections, while maintaining high alpha quality. To address this problem, we propose PSDiffusion, a unified diffusion framework for simultaneous multi-layer text-to-image generation. Our model automatically generates multi-layer images with one RGB background and multiple RGBA foregrounds in a single feed-forward pass. Unlike existing methods that combine multiple tools for post-hoc decomposition or generate layers sequentially and separately, our method introduces a global-layer interactive mechanism that generates layered images concurrently and collaboratively, ensuring not only high quality and completeness for each layer, but also spatial and visual interactions among layers for global coherence.
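The output format described above, one RGB background plus multiple RGBA foregrounds, flattens into a single image with standard back-to-front alpha compositing. The sketch below is a minimal illustration of that compositing step only; the function name, array shapes, and back-to-front layer ordering are assumptions for illustration, not part of the PSDiffusion implementation.

```python
import numpy as np

def composite_layers(background_rgb, foreground_rgba_layers):
    """Flatten one RGB background and a list of RGBA foreground layers
    into a single RGB image via the standard "over" operator.

    background_rgb: float array of shape (H, W, 3) in [0, 1]
    foreground_rgba_layers: list of float arrays of shape (H, W, 4) in [0, 1],
        ordered back-most to front-most (an assumed convention).
    """
    out = background_rgb.astype(np.float64).copy()
    for layer in foreground_rgba_layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # "over" compositing: new = fg * a + old * (1 - a)
        out = rgb * alpha + out * (1.0 - alpha)
    return np.clip(out, 0.0, 1.0)

if __name__ == "__main__":
    h, w = 64, 64
    bg = np.ones((h, w, 3)) * 0.8               # light gray background
    fg = np.zeros((h, w, 4))
    fg[16:48, 16:48, :3] = [1.0, 0.2, 0.2]      # red square foreground
    fg[16:48, 16:48, 3] = 0.9                   # mostly opaque alpha
    flat = composite_layers(bg, [fg])
    print(flat.shape, flat.min(), flat.max())
```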
Similar Papers
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
CV and Pattern Recognition
Creates realistic pictures from text, layer by layer.
OmniPSD: Layered PSD Generation with Diffusion Transformer
CV and Pattern Recognition
Turns flat pictures into editable layered designs.
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
CV and Pattern Recognition
Creates pictures from words for designs.