ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
By: Ryota Okumura, Kaede Shiohara, Toshihiko Yamasaki
Potential Business Impact:
Fixes wonky lines in AI pictures.
Recent text-to-image models, such as Stable Diffusion, have achieved impressive visual quality, yet they often suffer from geometric inconsistencies that undermine the structural realism of generated scenes. One prominent issue is vanishing point inconsistency, where projections of parallel lines fail to converge correctly in 2D space. This leads to structurally implausible geometry that degrades spatial realism, especially in architectural scenes. We propose ControlVP, a user-guided framework for correcting vanishing point inconsistencies in generated images. Our approach extends a pre-trained diffusion model by incorporating structural guidance derived from building contours. We also introduce geometric constraints that explicitly encourage alignment between image edges and perspective cues. Our method enhances global geometric consistency while maintaining visual fidelity comparable to the baselines. This capability is particularly valuable for applications that require accurate spatial structure, such as image-to-3D reconstruction. The dataset and source code are available at https://github.com/RyotaOkumura/ControlVP.
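To make the notion of vanishing point inconsistency concrete, the sketch below estimates a vanishing point from a handful of 2D line segments and measures how far each segment deviates from it. This is an illustration of the underlying geometry only and is not the ControlVP method or its loss. The function names (`line_through`, `estimate_vanishing_point`, `angular_deviation_deg`) and the segment representation as pairs of (x, y) points are assumptions made for this example.

```python
import numpy as np


def line_through(p, q):
    """Homogeneous line (a, b, c) through two image points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])


def estimate_vanishing_point(segments):
    """Least-squares common intersection of the lines carrying the segments.

    Returns a homogeneous point (x, y, w); w close to 0 means the
    vanishing point lies at (or near) infinity.
    """
    lines = np.array([line_through(p, q) for p, q in segments])
    # Scale each line so (a, b) has unit length; otherwise long segments
    # would dominate the fit.
    lines /= np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    # The point v minimizing ||lines @ v|| with ||v|| = 1 is the right
    # singular vector belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(lines)
    return vt[-1]


def angular_deviation_deg(segment, vp):
    """Angle (degrees) between a segment and the direction from its
    midpoint towards the vanishing point. Zero means perfectly consistent."""
    p, q = np.asarray(segment, dtype=float)
    mid = (p + q) / 2.0
    direction = q - p
    # Homogeneous form of (vp - mid); also valid when vp is at infinity.
    to_vp = vp[:2] - vp[2] * mid
    cos = abs(direction @ to_vp) / (
        np.linalg.norm(direction) * np.linalg.norm(to_vp)
    )
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
```

In a geometrically consistent image, segments that are parallel in 3D yield deviations near zero with respect to their shared vanishing point. A segment with a large deviation is the kind of edge that a correction method would need to realign.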
Similar Papers
3D-Consistent Multi-View Editing by Diffusion Guidance
CV and Pattern Recognition
Makes 3D pictures look right after editing.
Epipolar Geometry Improves Video Generation Models
CV and Pattern Recognition
Makes videos look real by fixing shaky camera moves.
CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
CV and Pattern Recognition
Makes videos change appearance and content easily.