Mem-MLP: Real-Time 3D Human Motion Generation from Sparse Inputs
By: Sinan Mutlu, Georgios F. Angelis, Savas Ozkan, and more
Potential Business Impact:
Makes virtual bodies move like real ones.
Realistic and smooth full-body tracking is crucial for immersive AR/VR applications. Existing systems primarily track only the head and hands via head-mounted devices (HMDs) and controllers, leaving the 3D full-body reconstruction incomplete. One potential approach is to generate full-body motion from the sparse inputs collected by these limited sensors using a neural network (NN) model. In this paper, we propose a novel method based on a multi-layer perceptron (MLP) backbone enhanced with residual connections and a novel NN component called the Memory-Block. In particular, the Memory-Block represents missing sensor data with trainable code-vectors, which are combined with the sparse signals from previous time instances to improve temporal consistency. Furthermore, we formulate our solution as a multi-task learning problem, allowing the MLP backbone to learn robust representations that boost accuracy. Our experiments show that our method outperforms state-of-the-art baselines by substantially reducing prediction errors. Moreover, it achieves 72 FPS on mobile HMDs, ultimately improving the accuracy/running-time tradeoff.
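The core idea described above can be sketched in code: missing sensor slots are filled with trainable code-vectors, the result is combined with the previous frame's signal, and the whole vector passes through an MLP block with a residual connection. The dimensions, layer sizes, and function names below are illustrative assumptions, not the paper's actual architecture; a minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper):
# 3 tracked sensors (head + two hands), 6 untracked "virtual" sensors,
# 16 features per sensor.
N_TRACKED, N_MISSING, FEAT = 3, 6, 16
HIDDEN = 64

# Trainable code-vectors standing in for missing sensor data
# (randomly initialized here; in the paper they are learned jointly
# with the MLP backbone).
code_vectors = rng.normal(size=(N_MISSING, FEAT))

# Weights for one residual MLP block (illustrative sizes).
D_IN = (N_TRACKED + N_MISSING) * FEAT * 2  # current + previous frame
W1 = rng.normal(scale=0.1, size=(D_IN, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, D_IN))

def memory_block(sparse_obs, prev_frame):
    """Pad the sparse observation with code-vectors, then append the
    previous frame's signal for temporal consistency."""
    full = np.concatenate([sparse_obs, code_vectors], axis=0)  # (9, 16)
    return np.concatenate([full.ravel(), prev_frame])          # (288,)

def residual_mlp(x):
    """One MLP block with a skip (residual) connection."""
    h = np.maximum(0.0, x @ W1)  # ReLU hidden layer
    return x + h @ W2            # residual connection

sparse_obs = rng.normal(size=(N_TRACKED, FEAT))  # HMD + controller signals
prev_frame = np.zeros((N_TRACKED + N_MISSING) * FEAT)

x = memory_block(sparse_obs, prev_frame)
y = residual_mlp(x)
print(y.shape)  # (288,)
```

In a full model, several such residual blocks would be stacked and the output decoded into full-body joint rotations; the code-vectors and weights would be trained end-to-end under the multi-task objective.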
Similar Papers
Efficient 3D Full-Body Motion Generation from Sparse Tracking Inputs with Temporal Windows
CV and Pattern Recognition
Makes virtual bodies move more realistically and faster.
Towards Arbitrary Motion Completing via Hierarchical Continuous Representation
CV and Pattern Recognition
Makes videos of people move smoother, faster, or slower.
A Personalized Data-Driven Generative Model of Human Motion
Graphics
Makes robots move like real people.