Monocular Depth Estimation with Global-Aware Discretization and Local Context Modeling

Published: August 5, 2025 | arXiv ID: 2508.03186v1

By: Heng Wu, Qian Zhang, Guixu Zhang

Potential Business Impact:

Helps computers guess how far away things are.

Accurate monocular depth estimation remains challenging because recovering 3D structure from a single view is ill-posed: multiple plausible depth configurations can produce identical 2D projections. In this paper, we present a novel depth estimation method that combines local and global cues to improve prediction accuracy. Specifically, we propose the Gated Large Kernel Attention Module (GLKAM), which captures multi-scale local structural information by combining large kernel convolutions with a gating mechanism. To further enhance the global perception of the network, we introduce the Global Bin Prediction Module (GBPM), which estimates the global distribution of depth bins and provides structural guidance for depth regression. Extensive experiments on the NYU-V2 and KITTI datasets demonstrate that our method achieves competitive performance and outperforms existing approaches, validating the effectiveness of each proposed component.
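The abstract does not spell out how depth bins are turned into a depth map, so the following is only a minimal sketch of one common bin-based formulation (globally predicted bin widths, with per-pixel depth taken as a probability-weighted sum of bin centers). It is not confirmed to be what GBPM does. The function names `bin_centers` and `depth_from_bins`, the depth range, the bin count, and the random inputs are all hypothetical placeholders.

```python
import numpy as np

def bin_centers(width_logits, d_min, d_max):
    """Map one global vector of bin-width logits to depth bin centers.

    Assumption: widths are normalized with a softmax so they are positive
    and sum to 1, then scaled to cover [d_min, d_max].
    """
    w = np.exp(width_logits - width_logits.max())
    w = w / w.sum()
    # Bin edges are the cumulative sum of the widths over the depth range.
    edges = d_min + (d_max - d_min) * np.concatenate(([0.0], np.cumsum(w)))
    # Each center is the midpoint of its two edges.
    return 0.5 * (edges[:-1] + edges[1:])

def depth_from_bins(pixel_logits, centers):
    """Combine per-pixel bin scores (H, W, K) with K centers into (H, W) depth."""
    z = pixel_logits - pixel_logits.max(axis=-1, keepdims=True)
    p = np.exp(z)
    p = p / p.sum(axis=-1, keepdims=True)  # per-pixel softmax over bins
    return p @ centers                     # expected depth under that distribution

# Illustrative usage with random stand-ins for network outputs.
rng = np.random.default_rng(0)
d_min, d_max, num_bins = 0.001, 10.0, 16   # placeholder values, not from the paper
centers = bin_centers(rng.normal(size=num_bins), d_min, d_max)
depth = depth_from_bins(rng.normal(size=(4, 6, num_bins)), centers)

# Equal logits give equal-width bins: centers at 1, 3, 5, 7 over [0, 8].
uniform_centers = bin_centers(np.zeros(4), 0.0, 8.0)
```

Because each pixel's depth is a convex combination of the bin centers, every predicted value stays inside the chosen depth range, which is one reason this style of discretization is often paired with a regression head.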

Country of Origin
🇨🇳 China

Page Count
12 pages

Category
Computer Science:
Computer Vision and Pattern Recognition