Monocular Depth Estimation with Global-Aware Discretization and Local Context Modeling
By: Heng Wu, Qian Zhang, Guixu Zhang
Potential Business Impact:
Helps computers judge how far away objects are from a single camera image.
Accurate monocular depth estimation remains a challenging problem due to the inherent ambiguity that stems from the ill-posed nature of recovering 3D structure from a single view, where multiple plausible depth configurations can produce identical 2D projections. In this paper, we present a novel depth estimation method that combines both local and global cues to improve prediction accuracy. Specifically, we propose the Gated Large Kernel Attention Module (GLKAM) to effectively capture multi-scale local structural information by leveraging large kernel convolutions with a gated mechanism. To further enhance the global perception of the network, we introduce the Global Bin Prediction Module (GBPM), which estimates the global distribution of depth bins and provides structural guidance for depth regression. Extensive experiments on the NYU-Depth V2 and KITTI datasets demonstrate that our method outperforms existing approaches, validating the effectiveness of each proposed component.
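The bin-based regression idea behind a module like GBPM can be illustrated with a small sketch: predict a set of global depth-bin widths over the valid depth range, convert them to bin centers, and recover each pixel's depth as a probability-weighted sum of those centers. This is a minimal pure-Python sketch in the style of prior bin-based methods (e.g. AdaBins); the function names, the depth range, and the bin count are illustrative assumptions, not the authors' actual GBPM implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def bin_centers(widths, d_min=1e-3, d_max=10.0):
    """Map predicted (unnormalized) bin widths to bin centers in [d_min, d_max].

    The widths are normalized to sum to 1, then accumulated into bin edges;
    each center is the midpoint of its bin. (Hypothetical helper, for illustration.)
    """
    total = sum(widths)
    norm = [w / total for w in widths]
    edges = [d_min]
    for w in norm:
        edges.append(edges[-1] + w * (d_max - d_min))
    return [(a + b) / 2 for a, b in zip(edges, edges[1:])]

def pixel_depth(logits, centers):
    """Per-pixel depth as the probability-weighted sum of global bin centers."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, centers))

# Example: 4 global bins over [0.001, 10] m and one pixel's bin logits.
centers = bin_centers([1.0, 1.0, 1.0, 1.0])
depth = pixel_depth([0.0, 2.0, 0.0, -1.0], centers)
```

Because the bins are predicted globally (per image) while the probabilities are predicted per pixel, the bin distribution acts as the structural guidance the abstract describes: it adapts the discretization of the depth range to the scene before the per-pixel regression happens.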
Similar Papers
Fine-Grained Cross-View Localization via Local Feature Matching and Monocular Depth Priors
CV and Pattern Recognition
Finds your location from a picture.
OmniDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment
CV and Pattern Recognition
Makes 3D pictures more accurate, even on shiny things.
UM-Depth: Uncertainty Masked Self-Supervised Monocular Depth Estimation with Visual Odometry
CV and Pattern Recognition
Makes self-driving cars see better in tricky spots.