Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications
By: Gasser Elazab, Maximilian Jansen, Michael Unterreiner and more
Potential Business Impact:
Helps cars see bumps and slopes on roads.
Accurate perception of the vehicle's 3D surroundings, including fine-scale road geometry, such as bumps, slopes, and surface irregularities, is essential for safe and comfortable vehicle control. However, conventional monocular depth estimation often oversmooths these features, losing critical information for motion planning and stability. To address this, we introduce Gamma-from-Mono (GfM), a lightweight monocular geometry estimation method that resolves the projective ambiguity in single-camera reconstruction by decoupling global and local structure. GfM predicts a dominant road surface plane together with residual variations expressed by gamma, a dimensionless measure of vertical deviation from the plane, defined as the ratio of a point's height above it to its depth from the camera, and grounded in established planar parallax geometry. With only the camera's height above ground, this representation deterministically recovers metric depth via a closed form, avoiding full extrinsic calibration and naturally prioritizing near-road detail. Its physically interpretable formulation makes it well suited for self-supervised learning, eliminating the need for large annotated datasets. Evaluated on KITTI and the Road Surface Reconstruction Dataset (RSRD), GfM achieves state-of-the-art near-field accuracy in both depth and gamma estimation while maintaining competitive global depth performance. Our lightweight 8.88M-parameter model adapts robustly across diverse camera setups and, to our knowledge, is the first self-supervised monocular approach evaluated on RSRD.
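The abstract does not spell out the closed form, so the following is only a minimal sketch of how gamma could be turned into metric depth under a standard planar-parallax setup. It assumes a pinhole camera, a unit road normal expressed in camera coordinates and pointing from the camera toward the road, and a viewing ray K^{-1}[u, v, 1]^T scaled so its z-component is 1. With height above the plane H = h_c - Z (n · r) and gamma = H / Z, depth follows as Z = h_c / (gamma + n · r). The function name, argument names, and sign conventions are illustrative assumptions, not the authors' implementation.

```python
def depth_from_gamma(gamma, ray, normal, cam_height):
    """Recover metric depth Z from gamma (hypothetical helper, not from the paper).

    gamma      : height above the road plane divided by depth (dimensionless)
    ray        : K^{-1} [u, v, 1]^T, scaled so its z-component equals 1 (assumption)
    normal     : unit road-plane normal in camera coordinates, pointing from
                 the camera toward the road (assumed sign convention)
    cam_height : camera height above the road plane, in metres
    """
    # Projection of the viewing ray onto the road normal.
    n_dot_r = sum(n * r for n, r in zip(normal, ray))

    # From gamma = H / Z and H = cam_height - Z * (n . r):
    #   gamma = cam_height / Z - n . r   =>   Z = cam_height / (gamma + n . r)
    denom = gamma + n_dot_r
    if denom <= 0.0:
        # The ray does not meet a point with this gamma in front of the camera
        # (for example, at or above the horizon with gamma near zero).
        raise ValueError("gamma + n.r must be positive for a finite positive depth")
    return cam_height / denom


# Example: level camera 1.5 m above the road, y-axis pointing down.
# A point 0.1 m above the road at 10 m depth has gamma = 0.1 / 10 = 0.01
# and a ray y-component of (1.5 - 0.1) / 10 = 0.14.
z = depth_from_gamma(0.01, (0.0, 0.14, 1.0), (0.0, 1.0, 0.0), 1.5)
print(round(z, 6))
```

Because the camera height sits in the numerator, an error in that single calibration value scales every recovered depth proportionally in this sketch. Points on the road plane itself correspond to gamma = 0.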
Similar Papers
Monocular Depth Estimation with Global-Aware Discretization and Local Context Modeling
CV and Pattern Recognition
Helps computers guess how far away things are.
Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction
CV and Pattern Recognition
Helps computers understand 3D space from pictures.
GeoDiff: Geometry-Guided Diffusion for Metric Depth Estimation
CV and Pattern Recognition
Makes single-camera pictures show true distances.