HumanRAM: Feed-forward Human Reconstruction and Animation Model using Transformers

Published: June 3, 2025 | arXiv ID: 2506.03118v1

By: Zhiyuan Yu, Zhe Li, Hujun Bao, and more

Potential Business Impact:

Makes 3D people from a few pictures.

Business Areas:

Virtual Reality Hardware, Software

3D human reconstruction and animation are long-standing topics in computer graphics and vision. However, existing methods typically rely on sophisticated dense-view capture and/or time-consuming per-subject optimization procedures. To address these limitations, we propose HumanRAM, a novel feed-forward approach for generalizable human reconstruction and animation from monocular or sparse human images. Our approach integrates human reconstruction and animation into a unified framework by introducing explicit pose conditions, parameterized by a shared SMPL-X neural texture, into transformer-based large reconstruction models (LRM). Given monocular or sparse input images with associated camera parameters and SMPL-X poses, our model employs scalable transformers and a DPT-based decoder to synthesize realistic human renderings under novel viewpoints and novel poses. By leveraging the explicit pose conditions, our model simultaneously enables high-quality human reconstruction and high-fidelity pose-controlled animation. Experiments show that HumanRAM significantly surpasses previous methods in terms of reconstruction accuracy, animation fidelity, and generalization performance on real-world datasets. Video results are available at https://zju3dv.github.io/humanram/.

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

CV and Pattern Recognition

Creates realistic 3D people from one picture.

13 Mar 2025 1

89%

PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images

CV and Pattern Recognition

Creates 3D people from photos for games.

16 Jun 2025 0

88%

HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration

CV and Pattern Recognition

Makes 3D people from one picture.

4 Apr 2025 0

Country of Origin

🇨🇳 🇭🇰 China, Hong Kong

Page Count

13 pages