Score: 1

Mask-Conditioned Voxel Diffusion for Joint Geometry and Color Inpainting

Published: January 1, 2026 | arXiv ID: 2601.00368v1

By: Aarya Sumuk

BigTech Affiliations: Stanford University

Potential Business Impact:

Fixes broken 3D objects by filling in missing parts.

Business Areas:

Image Recognition Data and Analytics, Software

We present a lightweight two-stage framework for joint geometry and color inpainting of damaged 3D objects, motivated by the digital restoration of cultural heritage artifacts. The pipeline separates damage localization from reconstruction. In the first stage, a 2D convolutional network predicts damage masks on RGB slices extracted from a voxelized object, and these predictions are aggregated into a volumetric mask. In the second stage, a diffusion-based 3D U-Net performs mask-conditioned inpainting directly on voxel grids, reconstructing geometry and color while preserving observed regions. The model jointly predicts occupancy and color using a composite objective that combines occupancy reconstruction with masked color reconstruction and perceptual regularization. We evaluate the approach on a curated set of textured artifacts with synthetically generated damage using standard geometric and color metrics. Compared to symmetry-based baselines, our method produces more complete geometry and more coherent color reconstructions at a fixed 32^3 resolution. Overall, the results indicate that explicit mask conditioning is a practical way to guide volumetric diffusion models for joint 3D geometry and color inpainting.