Canvas3D: Empowering Precise Spatial Control for Image Generation with Constraints from a 3D Virtual Canvas

Published: August 10, 2025 | arXiv ID: 2508.07135v1

By: Runlin Duan, Yuzhao Chen, Rahul Jain and more

Potential Business Impact:

Lets you place objects precisely in AI-generated images.

Generative AI (GenAI) has significantly advanced the ease and flexibility of image creation. However, it remains a challenge to precisely control spatial compositions, including object arrangement and scene conditions. To bridge this gap, we propose Canvas3D, an interactive system leveraging a 3D engine to enable precise spatial manipulation for image generation. Given a user prompt, Canvas3D automatically converts textual descriptions into interactive objects within a 3D engine-driven virtual canvas, empowering direct and precise spatial configuration. These user-defined arrangements generate explicit spatial constraints that guide generative models in accurately reflecting user intentions in the resulting images. We conducted a closed-ended comparative study between Canvas3D and a baseline system, and an open-ended study to evaluate our system "in the wild". The results indicate that Canvas3D outperforms the baseline on spatial control, interactivity, and overall user experience.