Avatar4D: Synthesizing Domain-Specific 4D Humans for Real-World Pose Estimation
By: Jerrin Bright, Zhibo Wang, Dmytro Klepachevskyi and more
We present Avatar4D, a real-world transferable pipeline for generating customizable synthetic human motion datasets tailored to domain-specific applications. Unlike prior work, which focuses on general, everyday motions and offers limited flexibility, our approach provides fine-grained control over body pose, appearance, camera viewpoint, and environmental context, without requiring any manual annotations. To validate the impact of Avatar4D, we focus on sports, where domain-specific human actions and movement patterns pose unique challenges for motion understanding. In this setting, we introduce Syn2Sport, a large-scale synthetic dataset spanning multiple sports, including baseball and ice hockey. Avatar4D features high-fidelity 4D (3D geometry over time) human motion sequences with varying player appearances rendered in diverse environments. We benchmark several state-of-the-art pose estimation models on Syn2Sport and demonstrate the dataset's effectiveness for supervised learning, zero-shot transfer to real-world data, and generalization across sports. Furthermore, we evaluate how closely the generated synthetic data aligns with real-world datasets in feature space. Our results highlight the potential of such systems to generate scalable, controllable, and transferable human datasets for diverse domain-specific tasks without relying on domain-specific real data.
Similar Papers
Gen4D: Synthesizing Humans and Scenes in the Wild
Graphics
Creates realistic sports videos for AI training.
MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars
CV and Pattern Recognition
Makes digital people move realistically from one photo.
Bringing Your Portrait to 3D Presence
CV and Pattern Recognition
Turns one photo into a moving 3D person.