Zero-Shot Metric Depth Estimation via Monocular Visual-Inertial Rescaling for Autonomous Aerial Navigation
By: Steven Yang, Xiaoyu Tian, Kshitij Goel, and more
Potential Business Impact:
Helps drones see how far away things are.
This paper presents a methodology to predict metric depth from monocular RGB images and an inertial measurement unit (IMU). To enable collision avoidance during autonomous flight, prior works either leverage heavy sensors (e.g., LiDARs or stereo cameras) or data-intensive, domain-specific fine-tuning of monocular metric depth estimation methods. In contrast, we propose several lightweight zero-shot rescaling strategies to obtain metric depth from relative depth estimates via the sparse 3D feature map created by a visual-inertial navigation system. These strategies are compared for their accuracy in diverse simulation environments. The best-performing approach, which leverages monotonic spline fitting, is deployed in the real world on a compute-constrained quadrotor. We obtain on-board metric depth estimates at 15 Hz and demonstrate successful collision avoidance after integrating the proposed method with a motion-primitives-based planner.
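To make the rescaling idea concrete, here is a minimal sketch of one way such a monotonic-spline strategy could work: relative depths predicted at the sparsely tracked VIO feature pixels are paired with the metric depths of those same features from the visual-inertial 3D map, a shape-preserving (PCHIP) spline is fit to that mapping, and the spline is then applied to the full relative depth map. The paper does not specify its exact spline formulation, so the function names and toy data below are illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy.interpolate import PchipInterpolator

def fit_monotonic_rescaling(rel_sparse, metric_sparse):
    """Fit a monotonic spline mapping relative depth -> metric depth.

    rel_sparse:    relative depths predicted at the sparse VIO feature pixels
    metric_sparse: metric depths of the same features from the VIO 3D map
    (Hypothetical helper; the paper's actual formulation may differ.)
    """
    order = np.argsort(rel_sparse)
    x, y = rel_sparse[order], metric_sparse[order]
    # PCHIP needs strictly increasing x; average metric depths at duplicates.
    x_unique, inverse = np.unique(x, return_inverse=True)
    y_unique = np.bincount(inverse, weights=y) / np.bincount(inverse)
    # Shape-preserving cubic: monotone wherever the sparse data are monotone.
    return PchipInterpolator(x_unique, y_unique, extrapolate=True)

# Usage sketch with synthetic stand-ins for the VIO features and network output.
rng = np.random.default_rng(0)
rel_sparse = rng.uniform(0.05, 1.0, size=50)                 # relative depth at tracked features
metric_sparse = 12.0 * rel_sparse + rng.normal(0, 0.1, 50)   # corresponding VIO metric depths (toy)

spline = fit_monotonic_rescaling(rel_sparse, metric_sparse)
rel_depth_map = rng.uniform(0.05, 1.0, size=(480, 640))      # stand-in for a network's relative depth map
metric_depth_map = spline(rel_depth_map)                     # per-pixel metric depth estimate

A shape-preserving interpolant is a natural choice here because relative depth should map to metric depth without inversions; with noisy feature depths, an isotonic-regression pre-fit could enforce monotonicity before splining.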
Similar Papers
VIMD: Monocular Visual-Inertial Motion and Depth Estimation
CV and Pattern Recognition
Helps robots see in 3D with just one camera.
A Novel Solution for Drone Photogrammetry with Low-overlap Aerial Images using Monocular Depth Estimation
CV and Pattern Recognition
Maps places better with fewer pictures.
Self-Supervised Learning to Fly using Efficient Semantic Segmentation and Metric Depth Estimation for Low-Cost Autonomous UAVs
CV and Pattern Recognition
Drones fly themselves using only cameras.