OmniMap: A General Mapping Framework Integrating Optics, Geometry, and Semantics
By: Yinan Deng, Yufeng Yue, Jianyu Dou, and more
Potential Business Impact:
Robots can build real-time maps capturing how a scene looks, its shape, and what its objects are.
Robotic systems demand accurate and comprehensive 3D environment perception, requiring simultaneous capture of photo-realistic appearance (optical), precise layout and shape (geometric), and open-vocabulary scene understanding (semantic). Existing methods typically satisfy only a subset of these requirements and suffer from optical blurring, geometric irregularities, and semantic ambiguities. To address these challenges, we propose OmniMap, the first online mapping framework that simultaneously captures optical, geometric, and semantic scene attributes while maintaining real-time performance and model compactness. At the architectural level, OmniMap employs a tightly coupled 3DGS-Voxel hybrid representation that combines fine-grained modeling with structural stability. At the implementation level, OmniMap identifies key challenges across the different modalities and introduces several innovations: adaptive camera modeling for motion-blur and exposure compensation, a hybrid incremental representation with normal constraints, and probabilistic fusion for robust instance-level understanding. Extensive experiments show OmniMap's superior performance in rendering fidelity, geometric accuracy, and zero-shot semantic segmentation compared to state-of-the-art methods across diverse scenes. The framework's versatility is further demonstrated by a variety of downstream applications, including multi-domain scene Q&A, interactive editing, perception-guided manipulation, and map-assisted navigation.
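To make the "probabilistic fusion" idea concrete, below is a minimal sketch of per-voxel instance-label fusion of the general kind the abstract alludes to: each voxel keeps a categorical distribution over instance labels that is multiplicatively (Bayesian) updated as new open-vocabulary detections arrive. The class and method names (VoxelInstanceFusion, update, labels) and the specific update rule are illustrative assumptions, not OmniMap's published implementation.

```python
# Hypothetical sketch of probabilistic instance-label fusion; the update
# rule and all names are assumptions for illustration, not OmniMap's code.
import numpy as np

class VoxelInstanceFusion:
    """Per-voxel categorical distributions over instance labels,
    fused in log space as new observations arrive."""

    def __init__(self, num_voxels: int, num_instances: int):
        # Uniform prior over instance labels for every voxel (log of 1s).
        self.log_probs = np.zeros((num_voxels, num_instances))

    def update(self, voxel_ids: np.ndarray, obs_probs: np.ndarray,
               eps: float = 1e-6) -> None:
        """Fuse one frame of per-voxel instance-label observations.

        voxel_ids : (k,) indices of voxels covered by this frame's detections.
        obs_probs : (k, num_instances) label distributions per observation,
                    e.g. softmax scores from an open-vocabulary segmenter.
        """
        # Bayesian fusion: posterior ∝ prior × likelihood, done in log space
        # for numerical stability; eps guards against log(0).
        self.log_probs[voxel_ids] += np.log(obs_probs + eps)
        # Renormalize so each voxel's distribution stays a proper one.
        self.log_probs[voxel_ids] -= np.logaddexp.reduce(
            self.log_probs[voxel_ids], axis=1, keepdims=True)

    def labels(self) -> np.ndarray:
        """Maximum a posteriori instance label per voxel."""
        return self.log_probs.argmax(axis=1)

# Usage: fuse 50 noisy detections over the first 50 voxels, then read labels.
fusion = VoxelInstanceFusion(num_voxels=1000, num_instances=8)
frame = np.random.dirichlet(np.ones(8), size=50)
fusion.update(voxel_ids=np.arange(50), obs_probs=frame)
print(fusion.labels()[:10])
```

Accumulating repeated per-frame evidence this way makes the fused label robust to any single noisy detection, which is the property the abstract claims for its instance-level understanding.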
Similar Papers
Online Object-Level Semantic Mapping for Quadrupeds in Real-World Environments
Robotics
Robot learns and remembers objects in a room.
OV-MAP : Open-Vocabulary Zero-Shot 3D Instance Segmentation Map for Robots
CV and Pattern Recognition
Helps robots see and understand objects in 3D.
OpenMap: Instruction Grounding via Open-Vocabulary Visual-Language Mapping
Robotics
Lets robots follow spoken directions in real rooms.