ReScene4D: Temporally Consistent Semantic Instance Segmentation of Evolving Indoor 3D Scenes
By: Emily Steiner, Jianhao Zheng, Henry Howard-Jenkins and more
Potential Business Impact:
Tracks objects moving in rooms over time.
Indoor environments evolve as objects move, appear, or disappear. Capturing these dynamics requires maintaining temporally consistent instance identities across intermittently captured 3D scans, even when changes are unobserved. We introduce and formalize the task of temporally sparse 4D indoor semantic instance segmentation (SIS), which jointly segments, identifies, and temporally associates object instances. This setting poses a challenge for existing 3DSIS methods, which require a discrete matching step due to their lack of temporal reasoning, and for 4D LiDAR approaches, which perform poorly due to their reliance on high-frequency temporal measurements that are uncommon in the longer-horizon evolution of indoor environments. We propose ReScene4D, a novel method that adapts 3DSIS architectures for 4DSIS without needing dense observations. It explores strategies to share information across observations, demonstrating that this shared context not only enables consistent instance tracking but also improves standard 3DSIS quality. To evaluate this task, we define a new metric, t-mAP, that extends mAP to reward temporal identity consistency. ReScene4D achieves state-of-the-art performance on the 3RScan dataset, establishing a new benchmark for understanding evolving indoor scenes.
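The abstract does not spell out how t-mAP is computed, so the following is only an illustrative sketch of why a temporally aware overlap score can reward identity consistency. It assumes (this is not confirmed by the source) that a predicted instance and a ground-truth instance are each represented as a track mapping a scan index to a set of point indices, and that overlap is pooled across all scans before thresholding. The names `Track` and `temporal_iou` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical representation: scan index -> point indices carrying one instance ID.
Track = Dict[int, Set[int]]


def temporal_iou(pred: Track, gt: Track) -> float:
    """IoU pooled over every scan in which either track appears.

    Assumption for illustration only: intersections and unions are summed
    across scans, so a track that loses its identity in one scan is penalized
    even if its per-scan masks are perfect.
    """
    inter = 0
    union = 0
    for t in set(pred) | set(gt):
        p = pred.get(t, set())
        g = gt.get(t, set())
        inter += len(p & g)
        union += len(p | g)
    return inter / union if union else 0.0


# Ground truth: one chair observed in two scans (it moved between them).
gt_chair = {0: {1, 2, 3, 4}, 1: {10, 11, 12, 13}}

# Consistent prediction: the same ID covers the chair in both scans.
consistent = {0: {1, 2, 3, 4}, 1: {10, 11, 12, 13}}

# Identity switch: the mask in scan 0 is perfect, but the chair receives a
# different ID in scan 1, so this track only covers scan 0.
id_switch = {0: {1, 2, 3, 4}}

print(temporal_iou(consistent, gt_chair))  # 1.0
print(temporal_iou(id_switch, gt_chair))   # 0.5
```

Under this assumed pooling, the identity-switched track would pass a per-scan IoU test in scan 0 yet fall below a 0.75 threshold overall, which is the kind of behavior a metric rewarding temporal identity consistency would need.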