Earth Observation Foundation Model PhilEO: Pretraining on the MajorTOM and FastTOM Datasets
By: Nikolaos Dionelis, Riccardo Musto, Jente Bosmans and more
Potential Business Impact:
Helps satellites understand Earth better with less work.
Today, Earth Observation (EO) satellites generate massive volumes of data. To fully exploit these data, it is essential to pretrain EO Foundation Models (FMs) on large unlabeled datasets, enabling efficient fine-tuning for downstream tasks with minimal labeled data. In this paper, we study scaling up FMs: we train our models on the 23TB MajorTOM pretraining dataset, which includes all regions, and their average performance is competitive with that of models pretrained on more specialized datasets that are substantially smaller and include only land. The additional ocean and ice data do not decrease performance on land-focused downstream tasks. These results indicate that large FMs trained on global datasets for a wider variety of downstream tasks can be useful for downstream applications that require only a subset of the information included in their training. The second contribution is the exploration of U-Net Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and Mamba State-Space Models (SSMs) as FMs. U-Net captures local correlations among pixels, while ViT and Mamba capture both local and distant correlations. We develop various models using different architectures, including U-Net, ViT, and Mamba, and different numbers of parameters. We evaluate the FLoating-point OPerations (FLOPs) needed by the models. We fine-tune on the PhilEO Bench for different downstream tasks: roads, buildings, and land cover. For most n-shot settings for roads and buildings, U-Net 200M-2T outperforms the other models. Using Mamba, we achieve comparable results on the downstream tasks at lower computational cost. We also compare with the recent FM TerraMind, which we evaluate on PhilEO Bench.
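The abstract's contrast between local (convolutional) and global (attention-based) models, and its mention of FLOP counts, can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only and is not the paper's FLOP accounting: the function names `conv2d_flops` and `self_attention_flops` are hypothetical, each multiply-add is assumed to count as 2 FLOPs, and biases, normalization, softmax, and activations are ignored. The layer sizes in the usage lines are arbitrary examples, not the paper's configurations.

```python
def conv2d_flops(h_out: int, w_out: int, c_in: int, c_out: int, k: int) -> int:
    """Approximate FLOPs of one k x k convolution layer.

    Assumption: one multiply-add = 2 FLOPs; bias terms are ignored.
    Cost grows linearly with the number of output pixels (h_out * w_out).
    """
    return 2 * h_out * w_out * c_in * c_out * k * k


def self_attention_flops(n_tokens: int, dim: int) -> int:
    """Approximate FLOPs of one single-layer self-attention block.

    Assumption: one multiply-add = 2 FLOPs; softmax and scaling are ignored.
    - Q, K, V and output projections: 4 matrices of size dim x dim -> 8 * N * d^2
    - Attention scores (Q K^T) and weighted sum over V -> 4 * N^2 * d
    The N^2 term is what makes global attention costly for many tokens.
    """
    projections = 8 * n_tokens * dim * dim
    pairwise = 4 * n_tokens * n_tokens * dim
    return projections + pairwise


if __name__ == "__main__":
    # Arbitrary example sizes, chosen only to show the scaling behaviour.
    for side in (8, 16, 32):
        n = side * side  # tokens on a side x side patch grid
        print(n, self_attention_flops(n, dim=768))
    print(conv2d_flops(h_out=128, w_out=128, c_in=3, c_out=64, k=3))
```

Under these assumptions, doubling the token count quadruples the pairwise attention term but only doubles the cost of a convolution over the same grid. Linear-time sequence models such as Mamba are designed to avoid that quadratic term, which is consistent with the lower computational cost the abstract reports for Mamba.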