
Model-Based Clustering of Functional Data Via Random Projection Ensembles

Published: December 1, 2025 | arXiv ID: 2512.01450v1

By: Matteo Mori, Laura Anderlucci

Potential Business Impact:

Groups similar curve-shaped data by clustering many random low-dimensional views of it and combining the results.

Business Areas:
A/B Testing, Data and Analytics

Clustering functional data is a challenging task due to its intrinsic infinite-dimensionality and the need for stable, data-adaptive partitioning. In this work, we propose a clustering framework based on Random Projections, which simultaneously performs dimensionality reduction and generates multiple stochastic representations of the original functions. Each projection is clustered independently, and the resulting partitions are then aggregated through an ensemble consensus procedure, enhancing robustness and mitigating the influence of any single projection. To focus on the most informative representations, projections are ranked according to clustering quality criteria, and only a selected subset is retained. In particular, we adopt Gaussian Mixture Models as base clusterers and employ the Kullback-Leibler divergence to order the random projections; these choices enable fast computation and eliminate the need to specify the number of clusters a priori. The performance of the proposed methodology is assessed through an extensive simulation study and two real-data applications, one on spectroscopy data for food authentication and one on log-periodograms of speech recordings; the results suggest that the proposal is an effective tool for clustering functional data.
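
The pipeline described in the abstract (project, cluster each projection, rank, keep a subset, aggregate) can be illustrated with a minimal sketch. It is not the authors' implementation, and several simplifications are assumptions made here for brevity: curves are discretized on a common grid; each random projection is onto a single Gaussian direction; the base clusterer is a two-component, equal-variance Gaussian mixture fitted by EM, so the number of clusters is fixed rather than selected from the data; the ranking score is the symmetrized Kullback-Leibler divergence between the two fitted components, which for equal variances reduces to (m1 - m2)^2 / sigma^2; and the consensus step links curves that share a cluster in more than half of the retained partitions. All function names and the toy data are hypothetical, and only the Python standard library is used.

```python
import math
import random


def simulate_curves(n_per_group, grid_size, noise_sd, rng):
    """Toy data: noisy sine curves (group 0) and cosine curves (group 1)."""
    grid = [j / (grid_size - 1) for j in range(grid_size)]
    curves, truth = [], []
    for group, shape in enumerate((math.sin, math.cos)):
        for _ in range(n_per_group):
            curves.append([shape(2 * math.pi * t) + rng.gauss(0, noise_sd) for t in grid])
            truth.append(group)
    return curves, truth


def project(curves, rng):
    """Project every discretized curve onto one random Gaussian direction."""
    direction = [rng.gauss(0, 1) for _ in curves[0]]
    return [sum(x * r for x, r in zip(curve, direction)) for curve in curves]


def fit_gmm_1d(z, n_iter=50):
    """EM for a 1-D mixture of two Gaussians with a shared variance (assumed model).

    Returns hard labels and the symmetrized KL divergence between components.
    """
    n = len(z)
    m1, m2 = min(z), max(z)  # start the two means at the extremes
    mean = sum(z) / n
    var = max(sum((v - mean) ** 2 for v in z) / n, 1e-12)
    w = 0.5
    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior of component 1, computed from clamped log-odds
        prior = math.log(w / (1 - w))
        resp = []
        for v in z:
            d = prior - ((v - m1) ** 2 - (v - m2) ** 2) / (2 * var)
            resp.append(1 / (1 + math.exp(-max(-50.0, min(50.0, d)))))
        # M-step: update weight, means and the shared variance
        n1 = min(max(sum(resp), 1e-9), n - 1e-9)
        w = n1 / n
        m1 = sum(r * v for r, v in zip(resp, z)) / n1
        m2 = sum((1 - r) * v for r, v in zip(resp, z)) / (n - n1)
        ss = sum(r * (v - m1) ** 2 + (1 - r) * (v - m2) ** 2 for r, v in zip(resp, z))
        var = max(ss / n, 1e-12)
    labels = [0 if r > 0.5 else 1 for r in resp]
    return labels, (m1 - m2) ** 2 / var


def co_association(partitions):
    """Fraction of partitions in which each pair of curves shares a cluster."""
    n = len(partitions[0])
    b = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / b for j in range(n)] for i in range(n)]


def consensus_labels(coassoc, threshold=0.5):
    """Connected components of the graph linking pairs above the threshold."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def rp_ensemble_cluster(curves, n_projections=30, n_keep=10, seed=0):
    """Cluster each projection, keep the best-separated ones, aggregate."""
    rng = random.Random(seed)
    fits = [fit_gmm_1d(project(curves, rng)) for _ in range(n_projections)]
    fits.sort(key=lambda fit: fit[1], reverse=True)  # highest divergence first
    kept = [labels for labels, _ in fits[:n_keep]]
    coassoc = co_association(kept)
    return consensus_labels(coassoc), coassoc


curves, truth = simulate_curves(n_per_group=20, grid_size=50, noise_sd=0.1, rng=random.Random(1))
labels, coassoc = rp_ensemble_cluster(curves)
print("clusters found:", len(set(labels)))
```

Because the co-association matrix only records whether two curves share a cluster, the aggregation is unaffected by the arbitrary labelling of components across projections. A fuller version would project onto more than one dimension and let a model-selection criterion choose the number of mixture components, as the abstract indicates.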

Country of Origin
🇮🇹 Italy

Page Count
25 pages

Category
Statistics: Methodology