DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
By: Chenxi Xie, Minghan Li, Shuai Li and more
Potential Business Impact:
Changes pictures using words, more accurately.
Training-free methods built on large-scale pretrained text-to-image models have demonstrated impressive image editing results. Conventional diffusion-based methods, as well as recent rectified flow (RF)-based methods, typically reverse the synthesis trajectory by gradually adding noise to a clean image. During this process, the noisy latent at the current timestep is used to approximate the one at the next timestep, which introduces accumulated drift and degrades reconstruction accuracy. Because in RF the noisy latent at each timestep is obtained by direct interpolation between Gaussian noise and the clean image, we propose Direct Noise Alignment (DNA), which refines the desired Gaussian noise directly in the noise domain and significantly reduces the error accumulation of previous methods. Specifically, DNA estimates the velocity field at the interpolated noisy latent at each timestep and adjusts the Gaussian noise using the difference between the predicted and expected velocity fields. We validate the effectiveness of DNA and reveal its relationship with existing RF-based inversion methods. Additionally, we introduce Mobile Velocity Guidance (MVG) to control the target prompt-guided generation process, balancing preservation of the image background against editability of the target object. DNA and MVG together constitute our proposed method, DNAEdit. Finally, we introduce DNA-Bench, a long-prompt benchmark, to evaluate the performance of advanced image editing models. Experimental results demonstrate that DNAEdit outperforms state-of-the-art text-guided editing methods. Code and the benchmark will be available at https://xiechenxi99.github.io/DNAEdit/.
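The noise-refinement idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The time convention (t = 0 is the clean image, t = 1 is pure noise), the expected velocity `eps - x0`, the update sign, the `step` size, the timestep schedule, and the toy velocity function are all assumptions made for illustration. In practice `velocity_fn` would be a pretrained RF model conditioned on the source prompt.

```python
import numpy as np

def interpolate(x0, eps, t):
    # RF interpolation between clean image x0 and noise eps
    # (assumed convention: t=0 is the image, t=1 is pure noise).
    return (1.0 - t) * x0 + t * eps

def refine_noise(x0, eps_init, velocity_fn, timesteps, step=0.5):
    # Hypothetical DNA-style loop: at every timestep the latent is rebuilt
    # directly from x0 and the current noise estimate, so no noisy latent
    # is carried over from one timestep to the next.
    eps = eps_init.copy()
    for t in timesteps:
        z_t = interpolate(x0, eps, t)      # interpolated noisy latent
        v_pred = velocity_fn(z_t, t)       # velocity predicted by the model
        v_expected = eps - x0              # velocity implied by the straight path
        eps = eps + step * (v_pred - v_expected)  # assumed update rule
    return eps

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 4))        # stand-in for a clean image latent
eps_star = rng.normal(size=(4, 4))  # noise the toy "model" maps to x0
eps_init = rng.normal(size=(4, 4))  # initial Gaussian noise guess

def toy_velocity(z_t, t):
    # Toy stand-in for a pretrained RF model: a constant field whose
    # straight-line flow carries eps_star to x0.
    return eps_star - x0

timesteps = np.linspace(0.1, 1.0, 10)
eps_refined = refine_noise(x0, eps_init, toy_velocity, timesteps)

err_before = np.linalg.norm(eps_init - eps_star)
err_after = np.linalg.norm(eps_refined - eps_star)
print(f"noise error before: {err_before:.4f}, after: {err_after:.6f}")
```

With this toy field, each update halves the gap between the current noise and `eps_star`, so the refined noise ends up close to the noise that the toy model maps back to the image. A real velocity network would not be constant, and the paper's actual update rule and schedule may differ from the ones assumed here.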
Similar Papers
Delta Velocity Rectified Flow for Text-to-Image Editing
CV and Pattern Recognition
Changes pictures based on your words better.
FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing
CV and Pattern Recognition
Makes AI image editing keep the original look.
Good Noise Makes Good Edits: A Training-Free Diffusion-Based Video Editing with Image and Text Prompts
CV and Pattern Recognition
Changes videos using pictures and words.