Motion Aware ViT-based Framework for Monocular 6-DoF Spacecraft Pose Estimation
By: Jose Sosa, Dan Pineau, Arunkumar Rathinam, et al.
Potential Business Impact:
Enables a spacecraft to estimate the position and orientation of a nearby target from a single camera, supporting autonomous rendezvous, docking, and servicing missions.
Monocular 6-DoF pose estimation plays an important role in multiple spacecraft missions. Most existing pose estimation approaches rely on single images with static keypoint localisation, failing to exploit valuable temporal information inherent to space operations. In this work, we adapt a deep learning framework from human pose estimation to the spacecraft pose estimation domain that integrates motion-aware heatmaps and optical flow to capture motion dynamics. Our approach combines image features from a Vision Transformer (ViT) encoder with motion cues from a pre-trained optical flow model to localise 2D keypoints. Using the estimates, a Perspective-n-Point (PnP) solver recovers 6-DoF poses from known 2D-3D correspondences. We train and evaluate our method on the SPADES-RGB dataset and further assess its generalisation on real and synthetic data from the SPARK-2024 dataset. Overall, our approach demonstrates improved performance over single-image baselines in both 2D keypoint localisation and 6-DoF pose estimation. Furthermore, it shows promising generalisation capabilities when testing on different data distributions.
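The final stage of the pipeline, recovering a 6-DoF pose from known 2D-3D keypoint correspondences, is a classical Perspective-n-Point problem. The paper does not specify which PnP solver is used; as an illustrative sketch only, the snippet below implements a simple linear PnP via the Direct Linear Transform (DLT) with NumPy, assuming known camera intrinsics and noiseless correspondences. Production systems would typically use a robust solver (e.g. RANSAC-wrapped EPnP) instead.

```python
import numpy as np

def dlt_pnp(pts3d, pts2d, K):
    """Linear PnP via the Direct Linear Transform.
    pts3d: (N,3) model keypoints, pts2d: (N,2) pixel detections,
    K: (3,3) camera intrinsics. Requires N >= 6 non-coplanar points."""
    # Normalise pixel coordinates with the inverse intrinsics.
    pts2d_h = np.hstack([pts2d, np.ones((len(pts2d), 1))])
    x = (np.linalg.inv(K) @ pts2d_h.T).T  # (N,3), last coord == 1
    # Each correspondence contributes two rows to the DLT system A p = 0.
    A = []
    for (X, Y, Z), (u, v, _) in zip(pts3d, x):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)          # [R|t] up to scale and sign
    R_raw, t = P[:, :3], P[:, 3]
    # Project R_raw onto SO(3) and remove the unknown scale.
    U, S, Vt2 = np.linalg.svd(R_raw)
    R = U @ Vt2
    t = t / np.mean(S)
    if np.linalg.det(R) < 0:          # fix the overall sign ambiguity
        R, t = -R, -t
    return R, t

# Synthetic check (hypothetical pose and intrinsics, not from the paper):
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1, 1, (8, 3))                 # model keypoints
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.1, -0.2, 5.0])
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
cam = (R_true @ pts3d.T).T + t_true                # camera-frame points
pix = (K @ cam.T).T
pts2d = pix[:, :2] / pix[:, 2:]                    # projected 2D keypoints
R_est, t_est = dlt_pnp(pts3d, pts2d, K)
```

With exact correspondences the linear solution recovers the pose to numerical precision; with noisy ViT keypoint detections, the DLT estimate is usually only a starting point for nonlinear refinement.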