MOGRAS: Human Motion with Grasping in 3D Scenes
By: Kunal Bhosikar, Siddharth Katageri, Vivek Madhavaram, and more
Potential Business Impact:
Helps robots grab objects more reliably in real-world places.
Generating realistic full-body motion that interacts with objects is critical for applications in robotics, virtual reality, and human-computer interaction. While existing methods can generate full-body motion within 3D scenes, they often lack the fidelity needed for fine-grained tasks like object grasping. Conversely, methods that generate precise grasping motions typically ignore the surrounding 3D scene. Closing this gap, that is, generating full-body grasping motions that are physically plausible within a 3D scene, remains a significant challenge. To address it, we introduce MOGRAS (Human MOtion with GRAsping in 3D Scenes), a large-scale dataset that bridges this gap. MOGRAS provides pre-grasping full-body walking motions and final grasping poses within richly annotated 3D indoor scenes. We leverage MOGRAS to benchmark existing full-body grasping methods and demonstrate their limitations in scene-aware generation. Furthermore, we propose a simple yet effective method for adapting existing approaches to work seamlessly within 3D scenes. Through extensive quantitative and qualitative experiments, we validate the effectiveness of our dataset and highlight the significant improvements our proposed method achieves, paving the way for more realistic human-scene interactions.
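To make the described data concrete, below is a minimal Python sketch of what one sample from a scene-grounded grasping dataset like MOGRAS might contain: a pre-grasp walking trajectory plus a final full-body grasping pose, placed in an annotated indoor scene. All field names, array shapes, and the use of a 165-dimensional SMPL-X-style pose vector are illustrative assumptions, not the dataset's published schema or API.

```python
# Hypothetical sketch of a MOGRAS-style sample; field names and shapes are
# assumptions for illustration, not the actual dataset layout.
from dataclasses import dataclass
import numpy as np

@dataclass
class MograsSample:
    scene_id: str            # annotated 3D indoor scene this motion lives in
    object_id: str           # target object to be grasped
    walk_poses: np.ndarray   # (T, 165) per-frame body pose params, pre-grasp walk
    walk_trans: np.ndarray   # (T, 3) per-frame root translation in scene coordinates
    grasp_pose: np.ndarray   # (165,) final full-body grasping pose
    grasp_trans: np.ndarray  # (3,) root translation of the grasping pose

def final_approach_direction(sample: MograsSample) -> np.ndarray:
    """Unit vector from the last walking frame toward the grasp location."""
    delta = sample.grasp_trans - sample.walk_trans[-1]
    return delta / (np.linalg.norm(delta) + 1e-8)

# Usage with placeholder data: a 120-frame walk ending near a mug.
sample = MograsSample(
    scene_id="scene_0001",
    object_id="mug_03",
    walk_poses=np.zeros((120, 165)),
    walk_trans=np.cumsum(np.full((120, 3), 0.01), axis=0),
    grasp_pose=np.zeros(165),
    grasp_trans=np.array([1.5, 0.0, 0.8]),
)
print(final_approach_direction(sample))
```

Keeping the walking sequence and the final grasping pose in the same scene coordinate frame, as sketched here, is what lets a benchmark check that the approach path and the grasp are mutually consistent with the surrounding geometry.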
Similar Papers
Learning Physics-Based Full-Body Human Reaching and Grasping from Brief Walking References
Robotics
Creates realistic human movements for robots.
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
CV and Pattern Recognition
Makes computer characters move more realistically from videos.
Prime and Reach: Synthesising Body Motion for Gaze-Primed Object Reach
CV and Pattern Recognition
Makes robots reach and grab things like people.