Integrating Prior Observations for Incremental 3D Scene Graph Prediction
By: Marian Renz, Felix Igelbrink, Martin Atzmueller
Potential Business Impact:
Helps robots understand messy places better.
3D semantic scene graphs (3DSSG) provide compact, structured representations of environments by explicitly modeling objects, attributes, and relationships. While 3DSSGs have shown promise in robotics and embodied AI, many existing methods rely mainly on raw sensor data and do not integrate further information available in semantically rich environments. Additionally, most methods assume access to complete scene reconstructions, limiting their applicability in real-world, incremental settings. This paper introduces a novel heterogeneous graph model for incremental 3DSSG prediction that integrates additional multi-modal information, such as prior observations, directly into the message-passing process. Using multiple layers, the model flexibly incorporates global and local scene representations without requiring specialized modules or full scene reconstructions. We evaluate our approach on the 3DSSG dataset, showing that GNNs enriched with multi-modal information such as semantic embeddings (e.g., CLIP) and prior observations offer a scalable and generalizable solution for complex, real-world environments. The full source code of the presented architecture will be made available at https://github.com/m4renz/incremental-scene-graph-prediction.
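To make the core idea concrete, here is a minimal, hypothetical sketch of fusing sensor-derived node features with prior-observation embeddings (e.g., CLIP-like vectors) before one round of message passing over typed edges. All function names, dimensions, and the blending scheme are illustrative assumptions for exposition, not the authors' implementation.

```python
# Hedged sketch: heterogeneous message passing with prior observations.
# Everything here (names, 2-D toy features, mean aggregation, the 0.5
# blending weights) is an illustrative assumption, not the paper's model.
from collections import defaultdict

def fuse(sensor_feat, prior_feat, alpha=0.5):
    """Blend current sensor features with a prior-observation embedding."""
    return [alpha * s + (1 - alpha) * p for s, p in zip(sensor_feat, prior_feat)]

def message_pass(node_feats, typed_edges):
    """One round of mean-aggregation message passing over typed edges."""
    incoming = defaultdict(list)
    for src, dst, _etype in typed_edges:  # edge type kept for heterogeneity
        incoming[dst].append(node_feats[src])
    updated = {}
    for node, feat in node_feats.items():
        msgs = incoming[node]
        if msgs:
            mean = [sum(vals) / len(msgs) for vals in zip(*msgs)]
            updated[node] = [0.5 * f + 0.5 * m for f, m in zip(feat, mean)]
        else:
            updated[node] = feat  # no neighbors: keep fused features
    return updated

# Toy incremental scene: two objects seen now, fused with prior embeddings.
sensor = {"chair": [1.0, 0.0], "table": [0.0, 1.0]}
prior  = {"chair": [0.8, 0.2], "table": [0.2, 0.8]}
feats = {k: fuse(sensor[k], prior[k]) for k in sensor}
edges = [("chair", "table", "next_to"), ("table", "chair", "next_to")]
out = message_pass(feats, edges)
```

The point of the sketch is that priors enter the node representations before message passing, so relationship prediction can benefit from them without a separate fusion module.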
Similar Papers
Statistical Confidence Rescoring for Robust 3D Scene Graph Generation from Multi-View Images
CV and Pattern Recognition
Helps computers understand 3D scenes from pictures.
Structured Interfaces for Automated Reasoning with 3D Scene Graphs
CV and Pattern Recognition
Robots understand spoken words by seeing objects.
GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
CV and Pattern Recognition
Creates realistic 3D rooms from your words.