Learning to Refocus with Video Diffusion Models
By: SaiKiran Tedla, Zhoutong Zhang, Xuaner Zhang, and more
Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. We introduce a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, our approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications. To support this work and future research, we release a large-scale focal stack dataset acquired under diverse real-world smartphone conditions. Our method consistently outperforms existing approaches in both perceptual quality and robustness across challenging scenarios, paving the way for more advanced focus-editing capabilities in everyday photography. Code and data are available at www.learn2refocus.github.io.
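
To make the setup concrete, the sketch below treats the focal stack as a short video conditioned on a single defocused photo, using an off-the-shelf image-to-video diffusion pipeline (Stable Video Diffusion via Hugging Face diffusers) as a stand-in backbone. The checkpoint, resolution, frame count, and the reading of each output frame as one focal slice are illustrative assumptions, not the authors' released code, which lives at the project page above.

# Illustrative sketch only: a generic image-to-video diffusion pipeline used
# as a stand-in for the paper's fine-tuned refocusing model.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Stand-in checkpoint; the paper's actual weights are not this model.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# A single defocused capture is the sole conditioning input.
image = load_image("defocused_photo.jpg").resize((1024, 576))

# Sample a 25-frame video; under the paper's framing, each frame would
# correspond to one focal plane of the stack (an assumption here).
frames = pipe(image, num_frames=25, decode_chunk_size=8).frames[0]
export_to_video(frames, "focal_stack.mp4", fps=7)

# "Interactive refocusing" then reduces to indexing the stack: pick the
# frame whose focal plane lands on the desired subject depth.
refocused = frames[12]  # e.g., a mid-stack focal plane
refocused.save("refocused.png")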