ART: Articulated Reconstruction Transformer
By: Zizhang Li, Cheng Zhang, Zhengqin Li, and more
Potential Business Impact:
Builds 3D models of objects with moving parts from a few pictures.
We introduce ART, Articulated Reconstruction Transformer -- a category-agnostic, feed-forward model that reconstructs complete 3D articulated objects from only sparse, multi-state RGB images. Previous methods for articulated object reconstruction either rely on slow optimization with fragile cross-state correspondences or use feed-forward models limited to specific object categories. In contrast, ART treats articulated objects as assemblies of rigid parts, formulating reconstruction as part-based prediction. Our newly designed transformer architecture maps sparse image inputs to a set of learnable part slots, from which ART jointly decodes unified representations for individual parts, including their 3D geometry, texture, and explicit articulation parameters. The resulting reconstructions are physically interpretable and readily exportable for simulation. Trained on a large-scale, diverse dataset with per-part supervision, and evaluated across multiple benchmarks, ART achieves significant improvements over existing baselines and establishes a new state of the art for articulated object reconstruction from image inputs.
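The part-slot idea in the abstract can be illustrated with a minimal sketch: learnable per-part queries attend to image tokens, and separate heads decode each part's geometry, texture, and articulation parameters. This is a hypothetical PyTorch illustration under assumed module names, head outputs, and dimensions, not the authors' actual architecture.

```python
# Hypothetical sketch of a part-slot reconstruction transformer, for illustration
# only. All names, output sizes, and design details below are assumptions, not
# the paper's implementation.
import torch
import torch.nn as nn


class PartSlotReconstructor(nn.Module):
    """Maps multi-state image tokens to K learnable part slots, then decodes
    per-part geometry, texture, and articulation parameters from each slot."""

    def __init__(self, num_parts=8, dim=256, num_heads=8, num_layers=4):
        super().__init__()
        # Patchify RGB images into tokens (stand-in for an image encoder).
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        # One learnable query ("slot") per candidate rigid part.
        self.part_slots = nn.Parameter(torch.randn(num_parts, dim))
        # Cross-attention decoder: slots attend to image tokens from all views.
        layer = nn.TransformerDecoderLayer(dim, num_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        # Per-part heads (output sizes are illustrative placeholders).
        self.geometry_head = nn.Linear(dim, 1024)  # latent code for part shape
        self.texture_head = nn.Linear(dim, 256)    # latent code for part texture
        self.joint_head = nn.Linear(dim, 8)        # e.g. axis (3) + origin (3) + type/limit (2)

    def forward(self, images):
        # images: (B, V, 3, H, W) sparse multi-state RGB views.
        B, V, _, _, _ = images.shape
        tokens = self.patch_embed(images.flatten(0, 1))       # (B*V, dim, h, w)
        tokens = tokens.flatten(2).transpose(1, 2)             # (B*V, N, dim)
        tokens = tokens.reshape(B, -1, tokens.shape[-1])       # concat tokens across views
        slots = self.part_slots.unsqueeze(0).expand(B, -1, -1) # (B, K, dim)
        slots = self.decoder(tgt=slots, memory=tokens)         # slots attend to image tokens
        return {
            "geometry": self.geometry_head(slots),
            "texture": self.texture_head(slots),
            "articulation": self.joint_head(slots),
        }


if __name__ == "__main__":
    model = PartSlotReconstructor()
    views = torch.randn(1, 4, 3, 224, 224)  # 4 sparse views of an object in different states
    outputs = model(views)
    print({k: v.shape for k, v in outputs.items()})
```

In this sketch, each slot's articulation vector would parameterize an explicit joint (axis, origin, type, limits), which is what makes the reconstruction exportable to a simulator; the actual parameterization used by ART may differ.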
Similar Papers
ArtiLatent: Realistic Articulated 3D Object Generation via Structured Latents
CV and Pattern Recognition
Creates realistic 3D objects that can move.
ArtGen: Conditional Generative Modeling of Articulated Objects in Arbitrary Part-Level States
CV and Pattern Recognition
Creates 3D objects that move realistically.
Particulate: Feed-Forward 3D Object Articulation
CV and Pattern Recognition
Makes 3D objects move like real toys.