
An Approach to Variable Clustering: K-means in Transposed Data and its Relationship with Principal Component Analysis

Published: November 30, 2025 | arXiv ID: 2512.00979v1

By: Victor Saquicela, Kenneth Palacio-Baus, Mario Chifla

Potential Business Impact:

Finds hidden patterns in data by grouping things.

Business Areas:

Big Data, Data and Analytics

Principal Component Analysis (PCA) and K-means constitute fundamental techniques in multivariate analysis. Although they are frequently applied independently or sequentially to cluster observations, the relationship between them, especially when K-means is used to cluster variables rather than observations, has been scarcely explored. This study seeks to address this gap by proposing an innovative method that analyzes the relationship between clusters of variables obtained by applying K-means on transposed data and the principal components of PCA. Our approach involves applying PCA to the original data and K-means to the transposed data set, where the original variables are converted into observations. The contribution of each variable cluster to each principal component is then quantified using measures based on variable loadings. This process provides a tool to explore and understand the clustering of variables and how such clusters contribute to the principal dimensions of variation identified by PCA.
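The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the loading-based measure, so the sketch assumes a cluster's contribution to a component is the sum of squared loadings of its variables. The helper names (`pca_loadings`, `kmeans`, `cluster_contributions`), the synthetic data, and the deterministic farthest-point initialization of K-means are likewise hypothetical choices made to keep the example short and reproducible.

```python
import numpy as np

def pca_loadings(X, n_components):
    """PCA via SVD on standardized data.

    Returns the standardized data Z (n x p) and the loadings (p x n_components).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z, Vt[:n_components].T

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all centers chosen so far.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def cluster_contributions(loadings, labels, k):
    """Assumed measure: share of each component's squared loadings per cluster.

    Each loading vector has unit norm, so every column of the result sums to 1.
    """
    sq = loadings ** 2
    return np.vstack([sq[labels == j].sum(axis=0) for j in range(k)])

# Synthetic data: four variables driven by factor f1, two by factor f2.
rng = np.random.default_rng(42)
n = 200
f1, f2 = rng.normal(size=(2, n))
noise = 0.1 * rng.normal(size=(n, 6))
X = np.column_stack([f1, f1, f1, f1, f2, f2]) + noise

# PCA on the original data (rows are observations).
Z, loadings = pca_loadings(X, n_components=2)

# K-means on the transposed data (rows are now the original variables).
labels = kmeans(Z.T, k=2)

# Rows: variable clusters; columns: principal components.
contrib = cluster_contributions(loadings, labels, k=2)
print("variable cluster labels:", labels)
print("cluster-by-component contributions:\n", contrib.round(3))
```

Standardizing before transposing means each variable is compared by its pattern across observations rather than by its scale; whether the paper standardizes in the same way is not stated in the abstract.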

Highly robust factored principal component analysis for matrix-valued outlier accommodation and explainable detection via matrix minimum covariance determinant

Methodology

Finds bad data points in complex pictures.

30 Sep 2025

87%

Principal Component Analysis When n < p: Challenges and Solutions

Methodology

Makes computer analysis better with messy, complex data.

21 Mar 2025

87%

TimeCluster with PCA is Equivalent to Subspace Identification of Linear Dynamical Systems

Machine Learning (CS)

Finds patterns in long, changing data.

16 Sep 2025


Country of Origin

🇪🇨 Ecuador

Page Count

8 pages
