Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
By: Tai Inui, Alexander Matsumura, Edgar Simo-Serra
Potential Business Impact:
Creates realistic landscapes from text and height maps.
Large-scale terrain generation remains a labor-intensive task in computer graphics. We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps while strictly adhering to a supplied Digital Elevation Model (DEM). The core mechanism is multi-scale content aggregation (MCA): DEM features from a pretrained encoder are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. Compared with a non-MCA baseline, MCA markedly improves visual fidelity and strengthens height-appearance coupling (FID $\downarrow$ 49.16%, LPIPS $\downarrow$ 32.33%, $\Delta$dCor $\downarrow$ to 0.0016). To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-language captions that describe visible land cover. We position Geodiffussr as a strong baseline and a step toward controllable 2.5D landscape generation for coarse-scale ideation and previsualization, complementary to physically based terrain and ecosystem simulators.
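The abstract describes MCA as injecting DEM encoder features into UNet blocks at several resolutions. The sketch below illustrates that injection pattern only; the encoder architecture, channel widths, and module names (`DEMEncoder`, `MCAInjection`) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal PyTorch sketch of multi-scale content aggregation (MCA).
# Assumption: DEM features form a coarse-to-fine pyramid and are added
# (after a 1x1 projection) to UNet activations at matching resolutions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DEMEncoder(nn.Module):
    """Toy stand-in for a pretrained DEM encoder that emits a feature pyramid."""

    def __init__(self, channels=(32, 64, 128)):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 1  # single-channel elevation map
        for out_ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                nn.SiLU(),
            ))
            in_ch = out_ch

    def forward(self, dem):
        feats, x = [], dem
        for stage in self.stages:
            x = stage(x)
            feats.append(x)  # one feature map per scale
        return feats


class MCAInjection(nn.Module):
    """Projects DEM features and adds them to a UNet block's activations."""

    def __init__(self, dem_ch, unet_ch):
        super().__init__()
        self.proj = nn.Conv2d(dem_ch, unet_ch, 1)

    def forward(self, unet_feat, dem_feat):
        # Resize DEM features to the UNet block's resolution, then inject.
        dem_feat = F.interpolate(dem_feat, size=unet_feat.shape[-2:],
                                 mode="bilinear", align_corners=False)
        return unet_feat + self.proj(dem_feat)


if __name__ == "__main__":
    dem = torch.randn(2, 1, 256, 256)   # batch of DEMs
    pyramid = DEMEncoder()(dem)         # 128x128, 64x64, 32x32 feature maps

    # Pretend these are activations of three UNet blocks at matching scales.
    unet_feats = [
        torch.randn(2, 320, 128, 128),
        torch.randn(2, 640, 64, 64),
        torch.randn(2, 1280, 32, 32),
    ]
    injectors = nn.ModuleList(
        MCAInjection(d.shape[1], u.shape[1])
        for d, u in zip(pyramid, unet_feats)
    )
    fused = [inj(u, d) for inj, u, d in zip(injectors, unet_feats, pyramid)]
    print([f.shape for f in fused])
```

In this reading, additive injection at every scale is what couples coarse elevation structure and fine relief to the generated texture; attention-based or concatenation-based fusion would be equally consistent with the abstract's wording.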
Similar Papers
MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data
Graphics
Creates realistic landscapes from text descriptions.
Digital Elevation Model Estimation from RGB Satellite Imagery using Generative Deep Learning
Image and Video Processing
Makes maps of land height from regular satellite pictures.
GrounDiff: Diffusion-Based Ground Surface Generation from Digital Surface Models
CV and Pattern Recognition
Makes maps of the bare ground from height maps that include buildings and trees.