InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem
By: Yeobin Hong, Suhyeon Lee, Hyungjin Chung and more
Recent approaches to controllable 4D video generation often rely on fine-tuning pre-trained Video Diffusion Models (VDMs). This dominant paradigm is computationally expensive, requiring large-scale datasets and architectural modifications, and frequently suffers from catastrophic forgetting of the model's original generative priors. Here, we propose InverseCrafter, an efficient inpainting inverse solver that reformulates the 4D generation task as an inpainting problem solved in the latent space. The core of our method is a principled mechanism to encode the pixel space degradation operator into a continuous, multi-channel latent mask, thereby bypassing the costly bottleneck of repeated VAE operations and backpropagation. InverseCrafter not only achieves comparable novel view generation and superior measurement consistency in camera control tasks with near-zero computational overhead, but also excels at general-purpose video inpainting with editing. Code is available at https://github.com/yeobinhong/InverseCrafter.
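To make the idea of a continuous latent mask concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify how the mask is encoded. The sketch assumes the simplest possible stand-in: a binary pixel-space visibility mask is average-pooled down to the latent resolution, copied across latent channels, and then used to blend measured latents with generated ones. The function names `pixel_mask_to_latent_mask` and `blend_latents`, the downsampling factor, and the channel count are all hypothetical.

```python
import numpy as np


def pixel_mask_to_latent_mask(pixel_mask, factor, channels):
    """Average-pool a binary (T, H, W) pixel mask into a continuous
    (T, C, H // factor, W // factor) latent mask.

    Hypothetical helper: average pooling is an assumption made for
    illustration, not the encoding described in the paper.
    """
    t, h, w = pixel_mask.shape
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by factor")
    # Group pixels into factor x factor patches and take each patch's mean,
    # so partially visible patches get a value strictly between 0 and 1.
    patches = pixel_mask.reshape(t, h // factor, factor, w // factor, factor)
    pooled = patches.mean(axis=(2, 4))
    # Assumption: the same mask is reused for every latent channel.
    return np.repeat(pooled[:, None, :, :], channels, axis=1)


def blend_latents(z_generated, z_measured, latent_mask):
    """Keep measured latents where the mask is 1 and generated latents
    where it is 0, interpolating linearly in between."""
    return latent_mask * z_measured + (1.0 - latent_mask) * z_generated
```

Because the blend runs entirely on latent tensors, a step like this needs no VAE encode or decode call and no backpropagation, which is the kind of saving the abstract refers to.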
Similar Papers
InstantViR: Real-Time Video Inverse Problem Solver with Distilled Diffusion Prior
CV and Pattern Recognition
Restores blurry videos instantly for streaming.
LVTINO: LAtent Video consisTency INverse sOlver for High Definition Video Restoration
CV and Pattern Recognition
Fixes blurry videos, keeping them smooth.
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Graphics
Builds 3D worlds from videos accurately.