GrounDiff: Diffusion-Based Ground Surface Generation from Digital Surface Models
By: Oussema Dhaouadi, Johannes Meier, Jacques Kaiser and more
Potential Business Impact:
Makes bare-ground elevation maps by removing non-ground structures from surface models.
Digital Terrain Models (DTMs) represent the bare-earth elevation and are important in numerous geospatial applications. Such data models cannot be directly measured by sensors and are typically generated from Digital Surface Models (DSMs) derived from LiDAR or photogrammetry. Traditional filtering approaches rely on manually tuned parameters, while learning-based methods require well-designed architectures, often combined with post-processing. To address these challenges, we introduce Ground Diffusion (GrounDiff), the first diffusion-based framework that iteratively removes non-ground structures by formulating the problem as a denoising task. We incorporate a gated design with confidence-guided generation that enables selective filtering. To increase scalability, we further propose Prior-Guided Stitching (PrioStitch), which employs a downsampled global prior automatically generated using GrounDiff to guide local high-resolution predictions. We evaluate our method on the DSM-to-DTM translation task across diverse datasets, showing that GrounDiff consistently outperforms deep learning-based state-of-the-art methods, reducing RMSE by up to 93% on ALS2DTM and up to 47% on USGS benchmarks. In the task of road reconstruction, which requires both high precision and smoothness, our method achieves up to 81% lower distance error compared to specialized techniques on the GeRoD benchmark, while maintaining competitive surface smoothness using only DSM inputs, without task-specific optimization. Our variant for road reconstruction, GrounDiff+, is specifically designed to produce even smoother surfaces, further surpassing state-of-the-art methods. The project page is available at https://deepscenario.github.io/GrounDiff/.
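The abstract reports its results as relative RMSE reductions. As a rough illustration of what such a figure means, the sketch below computes RMSE between a predicted DTM and a reference DTM, then the percentage reduction between two methods. It is not the authors' evaluation code: the function names `rmse` and `relative_reduction`, the per-cell averaging over the whole grid, and the toy elevation values are all assumptions made for this example.

```python
import math


def rmse(pred, ref):
    """Root-mean-square error between two elevation grids (lists of rows).

    Assumption: the error is averaged over every grid cell; the paper may
    mask or weight cells differently.
    """
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("grids must have the same shape")
    squared = [
        (p - r) ** 2
        for pred_row, ref_row in zip(pred, ref)
        for p, r in zip(pred_row, ref_row)
    ]
    if not squared:
        raise ValueError("grids must not be empty")
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy 2x2 grids in metres (made-up values, for illustration only).
reference_dtm = [[10.0, 10.0], [10.0, 10.0]]
baseline_dtm = [[12.0, 10.0], [10.0, 10.0]]   # hypothetical baseline output
improved_dtm = [[10.5, 10.0], [10.0, 10.0]]   # hypothetical improved output

baseline_rmse = rmse(baseline_dtm, reference_dtm)
improved_rmse = rmse(improved_dtm, reference_dtm)
reduction = relative_reduction(baseline_rmse, improved_rmse)
```

With these toy grids the baseline RMSE is 1.0 m and the improved RMSE is 0.25 m, a 75% reduction. A figure such as "up to 93%" in the abstract is read the same way, relative to the compared method's error on that benchmark.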
Similar Papers
GeoDiff: Geometry-Guided Diffusion for Metric Depth Estimation
CV and Pattern Recognition
Makes single-camera pictures show true distances.
Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
Graphics
Creates realistic landscapes from text and height maps.
DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model
CV and Pattern Recognition
Teaches computers to spot tiny differences in pictures.