A Semiparametric Gaussian Mixture Model with Spatial Dependence and Its Application to Whole-Slide Image Clustering Analysis

Published: October 18, 2025 | arXiv ID: 2510.16421v1

By: Baichen Yu, Jin Liu, Hansheng Wang

Potential Business Impact:

Finds cancer in pictures by grouping similar spots.

Business Areas:
Geospatial Data and Analytics, Navigation and Mapping

We develop a semiparametric Gaussian mixture model (SGMM) for unsupervised learning that takes spatial information into account. Specifically, we assume that each instance has a random location. Conditional on this location, the feature vector follows a standard Gaussian mixture model (GMM). The proposed SGMM allows the mixing probability to be nonparametrically related to the spatial location. Compared with a classical GMM, the SGMM is considerably more flexible and allows instances from the same class to be spatially clustered. To estimate the SGMM, novel EM algorithms are developed and rigorous asymptotic theory is established. Extensive numerical simulations are conducted to demonstrate the finite-sample performance of the method. As a real application, we apply the SGMM to the CAMELYON16 dataset of whole-slide images (WSIs) for breast cancer detection, where it demonstrates outstanding clustering performance.
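The idea of a location-dependent mixing probability can be illustrated with a minimal sketch. This is not the authors' estimator: the abstract does not specify the EM updates, so the code below assumes one simple choice, namely re-estimating the mixing probabilities at each location by Gaussian-kernel (Nadaraya-Watson) smoothing of the E-step responsibilities. The function names, the `bandwidth` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    sol = np.linalg.solve(cov, diff.T).T
    quad = np.sum(diff * sol, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (quad + logdet + d * np.log(2.0 * np.pi)))


def fit_spatial_gmm(x, s, n_components, bandwidth=0.1, n_iter=50, seed=0):
    """EM for a GMM whose mixing probabilities vary with location s.

    Assumption (not from the paper): pi_k(s_i) is updated by kernel
    smoothing of the responsibilities over neighbouring locations.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape

    # Row-normalized Gaussian kernel weights between all pairs of locations.
    sq_dist = np.sum((s[:, None, :] - s[None, :, :]) ** 2, axis=2)
    w = np.exp(-0.5 * sq_dist / bandwidth**2)
    w /= w.sum(axis=1, keepdims=True)

    # Initialization: random data points as means, pooled covariance,
    # and uniform mixing probabilities at every location.
    means = x[rng.choice(n, n_components, replace=False)].astype(float)
    pooled = np.cov(x.T).reshape(d, d) + 1e-6 * np.eye(d)
    covs = np.stack([pooled] * n_components)
    pi = np.full((n, n_components), 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibilities use the location-specific pi.
        dens = np.column_stack(
            [gaussian_pdf(x, means[k], covs[k]) for k in range(n_components)]
        )
        resp = pi * dens + 1e-300
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step for the Gaussian parameters (same as a classical GMM).
        nk = resp.sum(axis=0)
        means = (resp.T @ x) / nk[:, None]
        for k in range(n_components):
            diff = x - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k]
            covs[k] += 1e-6 * np.eye(d)

        # Nonparametric update: smooth responsibilities over space.
        pi = w @ resp

    return means, covs, pi, resp


# Usage on synthetic data: two classes separated by the first coordinate
# of the location, so that same-class instances are spatially clustered.
rng = np.random.default_rng(1)
s = rng.uniform(size=(200, 2))
labels = (s[:, 0] > 0.5).astype(int)
x = rng.normal(size=(200, 2)) + 4.0 * labels[:, None]
means, covs, pi, resp = fit_spatial_gmm(x, s, n_components=2)
```

Because the kernel weights and the responsibilities are both row-normalized, each row of `pi` remains a valid probability vector. A classical GMM is recovered in the limit of a very large `bandwidth`, where every location shares the same mixing probabilities.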

Page Count
27 pages

Category
Statistics: Methodology