Subspace Clustering of Subspaces: Unifying Canonical Correlation Analysis and Subspace Clustering

Published: September 23, 2025 | arXiv ID: 2509.18653v1

By: Paris A. Karakasis, Nicholas D. Sidiropoulos

Potential Business Impact:

Groups matrix-shaped data samples by the subspaces they span, even under heavy noise and interference.

Business Areas:
Big Data, Data and Analytics

We introduce a novel framework for clustering a collection of tall matrices based on their column spaces, a problem we term Subspace Clustering of Subspaces (SCoS). Unlike traditional subspace clustering methods that assume vectorized data, our formulation directly models each data sample as a matrix and clusters them according to their underlying subspaces. We establish conceptual links to Subspace Clustering and Generalized Canonical Correlation Analysis (GCCA), and clarify key differences that arise in this more general setting. Our approach is based on a Block Term Decomposition (BTD) of a third-order tensor constructed from the input matrices, enabling joint estimation of cluster memberships and partially shared subspaces. We provide the first identifiability results for this formulation and propose scalable optimization algorithms tailored to large datasets. Experiments on real-world hyperspectral imaging datasets demonstrate that our method achieves superior clustering accuracy and robustness, especially under high noise and interference, compared to existing subspace clustering techniques. These results highlight the potential of the proposed framework in challenging high-dimensional applications where structure exists beyond individual data vectors.
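To make the problem setting concrete, the following is a minimal sketch of clustering tall matrices by their column spaces. It is not the BTD-based method described in the abstract: it swaps in a simple baseline (orthonormal bases from an SVD, an affinity built from principal angles, and a two-way spectral split). The function names, the synthetic data, the assumed subspace dimension of 3, and the noise level are all illustrative assumptions.

```python
import numpy as np

def column_space_basis(X, rank):
    """Orthonormal basis for the dominant `rank`-dimensional column space of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]

def subspace_affinity(bases):
    """Pairwise affinity in [0, 1]: mean squared cosine of the principal angles."""
    n = len(bases)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            r = min(bases[i].shape[1], bases[j].shape[1])
            # ||U_i^T U_j||_F^2 equals the sum of squared cosines of principal angles.
            A[i, j] = np.linalg.norm(bases[i].T @ bases[j], "fro") ** 2 / r
    return A

def two_way_spectral_split(A):
    """Split items into two groups by the sign of the Fiedler vector
    of the symmetric normalized graph Laplacian."""
    d = A.sum(axis=1)
    L = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Synthetic data (assumption): two clusters of tall 30 x 5 matrices, each
# cluster sharing one random 3-dimensional column space, plus small noise.
rng = np.random.default_rng(0)
ambient_dim, sub_dim, n_cols, per_cluster = 30, 3, 5, 6
matrices = []
for _cluster in range(2):
    shared_basis, _r = np.linalg.qr(rng.standard_normal((ambient_dim, sub_dim)))
    for _sample in range(per_cluster):
        X = shared_basis @ rng.standard_normal((sub_dim, n_cols))
        matrices.append(X + 0.01 * rng.standard_normal(X.shape))

bases = [column_space_basis(X, sub_dim) for X in matrices]
labels = two_way_spectral_split(subspace_affinity(bases))
print(labels)
```

This baseline treats every column space as fully shared within a cluster and needs the subspace dimension in advance. The abstract's formulation instead estimates cluster memberships and partially shared subspaces jointly, which this sketch does not attempt.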

Country of Origin
🇺🇸 United States

Page Count
13 pages

Category
Computer Science:
Machine Learning (CS)