Unveiling 3D Ocean Biogeochemical Provinces in the North Atlantic: A Systematic Comparison and Validation of Clustering Methods
By: Yvonne Jenniges, Maike Sonnewald, Sebastian Maneth and more
Potential Business Impact:
Maps ocean areas more accurately and reliably.
Defining ocean regions and water masses helps to understand marine processes and can serve downstream tasks such as defining marine protected areas. However, such definitions often result from subjective decisions, potentially producing misleading, unreproducible outcomes. Here, the aim was to objectively define regions of the North Atlantic through systematic comparison of clustering methods within the Native Emergent Manifold Interrogation (NEMI) framework (Sonnewald, 2023). About 300 million measurements of salinity, temperature, and oxygen, nitrate, phosphate and silicate concentrations served as input for various clustering methods (k-Means, agglomerative Ward, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN)). Uniform Manifold Approximation and Projection (UMAP) emphasised (dis-)similarities in the data while reducing dimensionality. Based on systematic validation of clustering methods and their hyperparameters using internal, external and relative validation techniques, results showed that UMAP-DBSCAN best represented the data. Strikingly, internal validation metrics proved systematically unreliable for comparing clustering methods. To address stochastic variability, 100 UMAP-DBSCAN clustering runs were conducted and aggregated following NEMI, yielding a final set of 321 clusters. Reproducibility was evaluated via ensemble overlap ($88.81\pm1.8\%$) and mean grid cell-wise uncertainty ($15.49\pm20\%$). Case studies of the Mediterranean Sea, deep Atlantic waters and Labrador Sea showed strong agreement with common water mass definitions. Through systematic comparison of clustering methods, this study revealed a more detailed regionalisation than previous concepts such as the Longhurst provinces. The applied method is objective, efficient and reproducible and will support future research on biogeochemical differences and changes in oceanic regions.
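To make the idea of aggregating an ensemble of clustering runs and reporting a grid cell-wise uncertainty more concrete, here is a minimal sketch. It is not the NEMI procedure from the paper: it assumes that cluster labels have already been matched across runs, takes a simple majority vote per grid cell, and defines uncertainty as the percentage of runs that disagree with that vote. The function name `aggregate_ensemble` and the toy label data are hypothetical.

```python
from collections import Counter


def aggregate_ensemble(runs):
    """Majority-vote label and disagreement percentage per grid cell.

    runs: list of label lists, one per clustering run, all of equal length.
    Assumption of this sketch: labels are already matched across runs,
    so label k means the same cluster in every run.
    """
    n_runs = len(runs)
    majority, uncertainty = [], []
    # zip(*runs) yields, for each grid cell, its labels across all runs
    for cell_labels in zip(*runs):
        label, count = Counter(cell_labels).most_common(1)[0]
        majority.append(label)
        # share of runs (in percent) that disagree with the majority label
        uncertainty.append(100.0 * (1 - count / n_runs))
    return majority, uncertainty


# Toy ensemble: 4 runs over 5 grid cells (made-up labels, not real data)
runs = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 2, 2],
    [0, 1, 1, 1, 2],
    [0, 0, 1, 1, 2],
]
majority, uncertainty = aggregate_ensemble(runs)
mean_uncertainty = sum(uncertainty) / len(uncertainty)
print(majority, uncertainty, mean_uncertainty)
```

In a real workflow the label-matching step is the hard part, since independent clustering runs number their clusters arbitrarily; the sketch deliberately leaves it out.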
Similar Papers
Defining 3-dimensional marine provinces with phytoplankton compositions
Applications
Maps ocean life in 3D, not just flat.
Online Clustering of Seafloor Imagery for Interpretation during Long-Term AUV Operations
CV and Pattern Recognition
Helps underwater robots understand ocean floor pictures.
Estimating carbon pools in the shelf sea environment: reanalysis or model-informed machine learning?
Quantitative Methods
Helps track ocean carbon better, cheaper.