From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding
By: Anna-Maria Halacheva, Jan-Nico Zaech, Sombit Dey and more
Potential Business Impact:
Helps robots learn from scans of real places and makes those scans easier to edit.
Real-world 3D scene-level scans offer realism and can enable better real-world generalizability for downstream applications. However, challenges such as data volume, diverse annotation formats, and tool compatibility limit their use. This paper demonstrates a methodology to effectively leverage these scans and their annotations. We propose a unified annotation integration using USD, with application-specific USD flavors. We identify challenges in utilizing holistic real-world scan datasets and present mitigation strategies. The efficacy of our approach is demonstrated through two downstream applications: LLM-based scene editing, enabling effective LLM understanding and adaptation of the data (80% success), and robotic simulation, achieving an 87% success rate in policy learning.
Similar Papers
Generating Actionable Robot Knowledge Bases by Combining 3D Scene Graphs with Robot Ontologies
Robotics
Robots understand their surroundings to make smart choices.
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding
CV and Pattern Recognition
Teaches computers to understand 3D objects better.
Deep Learning Perspective of Scene Understanding in Autonomous Robots
CV and Pattern Recognition
Helps robots see and understand the world.