One-Shot Refiner: Boosting Feed-forward Novel View Synthesis via One-Step Diffusion

Published: January 20, 2026 | arXiv ID: 2601.14161v1

By: Yitong Dong, Qi Zhang, Minchao Jiang and more

Potential Business Impact:

Produces sharp, clear images of a scene from only a few input photos.

Business Areas:
Image Recognition Data and Analytics, Software

We present a novel framework for high-fidelity novel view synthesis (NVS) from sparse images, addressing key limitations in recent feed-forward 3D Gaussian Splatting (3DGS) methods built on Vision Transformer (ViT) backbones. While ViT-based pipelines offer strong geometric priors, they are often constrained to low-resolution inputs by computational cost. Moreover, existing generative enhancement methods tend to be 3D-agnostic, producing inconsistent structures across views, especially in unseen regions. To overcome these challenges, we design a Dual-Domain Detail Perception Module, which handles high-resolution images without being limited by the ViT backbone and endows the Gaussians with additional features that store high-frequency details. We develop a feature-guided diffusion network that preserves these high-frequency details during restoration. We further introduce a unified training strategy that jointly optimizes the ViT-based geometric backbone and the diffusion-based refinement module. Experiments across multiple datasets demonstrate that our method achieves superior generation quality.
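The abstract does not spell out the refiner's formulation, but "one-step diffusion" conventionally means recovering a clean-image estimate from a noised input in a single denoising step via the standard DDPM identity x_t = sqrt(ᾱ_t)·x0 + sqrt(1−ᾱ_t)·ε. A minimal NumPy sketch of that single-step recovery, with an oracle noise prediction standing in for the paper's learned, feature-guided network (all names here are illustrative, not from the paper):

```python
import numpy as np

def one_step_x0(x_t, eps_hat, alpha_bar_t):
    """Single-step clean-image estimate from a noisy input, inverting
    the DDPM forward relation x_t = sqrt(ab)*x0 + sqrt(1-ab)*eps."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 1.0, (8, 8, 3))   # stand-in for a rendered novel view
eps = rng.standard_normal(x0.shape)      # forward-process noise
alpha_bar = 0.3                          # cumulative noise schedule at step t
x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# With an oracle noise prediction the view is recovered exactly; in the
# paper's setting a learned network would supply eps_hat, guided by the
# high-frequency features stored on the Gaussians.
x0_hat = one_step_x0(x_t, eps, alpha_bar)
assert np.allclose(x0_hat, x0)
```

Collapsing the reverse process to one step is what makes such a refiner cheap enough to sit behind a feed-forward 3DGS renderer at inference time.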

Country of Origin
🇨🇳 China

Page Count
9 pages

Category
Computer Science:
CV and Pattern Recognition