From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding

Published: July 23, 2025 | arXiv ID: 2507.17585v1

By: Anna-Maria Halacheva, Jan-Nico Zaech, Sombit Dey and more

Potential Business Impact:

Makes robots learn and edit real places better.

Business Areas:
Image Recognition Data and Analytics, Software

Real-world 3D scene-level scans offer realism and can enable better real-world generalizability for downstream applications. However, challenges such as data volume, diverse annotation formats, and tool compatibility limit their use. This paper demonstrates a methodology to effectively leverage these scans and their annotations. We propose a unified annotation integration using Universal Scene Description (USD), with application-specific USD flavors. We identify challenges in utilizing holistic real-world scan datasets and present mitigation strategies. The efficacy of our approach is demonstrated through two downstream applications: LLM-based scene editing, enabling effective LLM understanding and adaptation of the data (80% success), and robotic simulation, achieving an 87% success rate in policy learning.
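
To make the idea of unifying diverse annotation formats in USD more concrete, here is a minimal sketch. It is not the paper's pipeline: the two source formats, the ObjectAnnotation record, the semantic:label attribute name, and the helper functions are all hypothetical and chosen only for illustration. It uses plain Python to normalize two differently shaped annotation records and write them as text in USD's human-readable .usda format, so it does not depend on the USD Python bindings.

```python
from dataclasses import dataclass


@dataclass
class ObjectAnnotation:
    """Unified record for one scanned object (hypothetical schema)."""
    name: str      # must be a valid USD prim name (letters, digits, underscores)
    label: str     # semantic class, e.g. "chair"
    center: tuple  # (x, y, z) position; units are an assumption of this sketch


def from_format_a(rec: dict) -> ObjectAnnotation:
    # Hypothetical source format A: {"id": 3, "class": "chair", "xyz": [x, y, z]}
    return ObjectAnnotation(f"obj_{rec['id']}", rec["class"], tuple(rec["xyz"]))


def from_format_b(rec: dict) -> ObjectAnnotation:
    # Hypothetical source format B:
    # {"name": "table_1", "category": "table", "position": {"x": .., "y": .., "z": ..}}
    p = rec["position"]
    return ObjectAnnotation(rec["name"], rec["category"], (p["x"], p["y"], p["z"]))


def to_usda(scene: str, objects: list) -> str:
    """Serialize unified annotations as a small .usda text layer."""
    lines = [
        "#usda 1.0",
        "(",
        f'    defaultPrim = "{scene}"',
        ")",
        "",
        f'def Xform "{scene}"',
        "{",
    ]
    for o in objects:
        x, y, z = o.center
        lines += [
            f'    def Xform "{o.name}"',
            "    {",
            # Custom namespaced attribute holding the semantic label (assumed name).
            f'        custom string semantic:label = "{o.label}"',
            f"        double3 xformOp:translate = ({x}, {y}, {z})",
            '        uniform token[] xformOpOrder = ["xformOp:translate"]',
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    objs = [
        from_format_a({"id": 3, "class": "chair", "xyz": [1.0, 0.0, 2.5]}),
        from_format_b({"name": "table_1", "category": "table",
                       "position": {"x": 0.5, "y": 0.0, "z": 1.0}}),
    ]
    print(to_usda("Scene", objs))
```

An application-specific "flavor" in this sketch would amount to a different to_usda-style writer over the same unified records, for example one that emits only the labels an LLM needs and another that adds the physics attributes a simulator expects; how the paper actually defines its flavors is not specified in the abstract.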

Page Count
5 pages

Category
Computer Science:
CV and Pattern Recognition