Fine-grained Defocus Blur Control for Generative Image Models
By: Ayush Shrivastava, Connelly Barnes, Xuaner Zhang, and more
Potential Business Impact:
Lets users control realistic camera blur in generated pictures.
Current text-to-image diffusion models excel at generating diverse, high-quality images, yet they struggle to incorporate fine-grained camera metadata such as precise aperture settings. In this work, we introduce a novel text-to-image diffusion framework that leverages camera metadata, or EXIF data, which is often embedded in image files, with an emphasis on generating controllable lens blur. Our method mimics the physical image formation process: it first generates an all-in-focus image, estimates its monocular depth, predicts a plausible focus distance with a novel focus distance transformer, and then forms a defocused image with an existing differentiable lens blur model. Gradients flow backward through this entire pipeline, allowing us to learn, without explicit supervision, to generate defocus effects that depend on scene content and the provided EXIF data. At inference time, this enables precise, interactive user control over defocus effects while preserving scene contents, which existing diffusion models cannot achieve. Experimental results demonstrate that our model enables superior fine-grained control without altering the depicted scene.
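The abstract does not specify which differentiable lens blur model the authors use, so the PyTorch sketch below shows only one common differentiable approximation of the last two stages: a thin-lens circle of confusion (CoC) computed from EXIF-style parameters (focal length, f-number, focus distance) and a per-pixel blend over a stack of Gaussian-blurred copies of the all-in-focus image. All function names, units, and the full-frame sensor-width assumption are mine, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def gaussian_blur(img: torch.Tensor, sigma: float) -> torch.Tensor:
    """Separable Gaussian blur for a (B, C, H, W) tensor."""
    k = int(2 * round(3 * sigma) + 1)                    # kernel spans ~6 sigma
    x = torch.arange(k, dtype=img.dtype, device=img.device) - k // 2
    g = torch.exp(-0.5 * (x / sigma) ** 2)
    g = g / g.sum()
    c = img.shape[1]
    img = F.conv2d(img, g.view(1, 1, 1, k).repeat(c, 1, 1, 1),
                   padding=(0, k // 2), groups=c)        # horizontal pass
    img = F.conv2d(img, g.view(1, 1, k, 1).repeat(c, 1, 1, 1),
                   padding=(k // 2, 0), groups=c)        # vertical pass
    return img


def thin_lens_coc_mm(depth_mm, focus_mm, focal_mm, f_number):
    """CoC diameter on the sensor (mm) from the thin-lens model."""
    aperture_mm = focal_mm / f_number                    # aperture diameter
    return (aperture_mm * focal_mm * (depth_mm - focus_mm).abs()
            / (depth_mm * (focus_mm - focal_mm).clamp(min=1e-3)))


def render_defocus(sharp, depth_mm, focus_mm, focal_mm=50.0, f_number=2.0,
                   sensor_width_mm=36.0, sigmas=(1.0, 2.0, 4.0, 8.0, 16.0)):
    """Differentiably defocus an all-in-focus image: blur it at several
    strengths, then blend per pixel according to the thin-lens CoC."""
    px_per_mm = sharp.shape[-1] / sensor_width_mm        # full-frame assumption
    coc_px = thin_lens_coc_mm(depth_mm, focus_mm, focal_mm, f_number) * px_per_mm
    levels = torch.tensor((0.0,) + tuple(sigmas),
                          dtype=sharp.dtype, device=sharp.device)
    stack = torch.stack([sharp] + [gaussian_blur(sharp, s) for s in sigmas])
    r = (coc_px / 2.0).clamp(max=float(levels[-1]))      # CoC radius ~ blur sigma
    w = torch.softmax(-(r.unsqueeze(0) - levels.view(-1, 1, 1, 1, 1)) ** 2, dim=0)
    return (w * stack).sum(dim=0)                        # (B, C, H, W)


# Usage: because every step is differentiable, a loss on the defocused output
# sends gradients back to focus_mm, which is the kind of end-to-end signal the
# abstract describes for training the focus distance predictor.
sharp = torch.rand(1, 3, 256, 256)                       # all-in-focus image
depth_mm = 500.0 + 19500.0 * torch.rand(1, 1, 256, 256)  # depths of 0.5 m .. 20 m
focus_mm = torch.tensor(1500.0, requires_grad=True)      # predicted focus distance
out = render_defocus(sharp, depth_mm, focus_mm, f_number=1.8)
out.mean().backward()                                    # focus_mm.grad is populated
```

In the paper, `sharp` and `depth_mm` would come from the generated image and its estimated monocular depth, and `focus_mm` from the focus distance transformer; the Gaussian-stack blend here merely stands in for whatever differentiable lens blur model the authors actually use.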
Similar Papers
Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models
Graphics
Makes AI pictures look like real photos.
BokehDiff: Neural Lens Blur with One-Step Diffusion
CV and Pattern Recognition
Adds realistic lens blur to photos in one step.
LightLab: Controlling Light Sources in Images with Diffusion Models
CV and Pattern Recognition
Changes light in pictures with simple controls.