Ranking Vectors Clustering: Theory and Applications
By: Ali Fattahi, Ali Eshragh, Babak Aslani and more
Potential Business Impact:
Groups similar ranked lists to make better choices.
We study the problem of clustering ranking vectors, where each vector represents preferences as an ordered list of distinct integers. Specifically, we focus on the k-centroids ranking vectors clustering problem (KRC), which aims to partition a set of ranking vectors into k clusters and identify the centroid of each cluster. Unlike classical k-means clustering (KMC), KRC constrains both the observations and centroids to be ranking vectors. We establish the NP-hardness of KRC and characterize its feasible set. For the single-cluster case, we derive a closed-form analytical solution for the optimal centroid, which can be computed in linear time. To address the computational challenges of KRC, we develop an efficient approximation algorithm, KRCA, which iteratively refines initial solutions from KMC, referred to as the baseline solution. Additionally, we introduce a branch-and-bound (BnB) algorithm for efficient cluster reconstruction within KRCA, leveraging a decision tree framework to reduce computational time while incorporating a controlling parameter to balance solution quality and efficiency. We establish theoretical error bounds for KRCA and BnB. Through extensive numerical experiments on synthetic and real-world datasets, we demonstrate that KRCA consistently outperforms baseline solutions, delivering significant improvements in solution quality with fast computational times. This work highlights the practical significance of KRC for personalization and large-scale decision making, offering methodological advancements and insights that can be built upon in future studies.
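The abstract does not state the closed-form centroid or the distance measure, so the following is only a minimal sketch under an assumption: that, as in k-means, the objective is the total squared Euclidean distance. Under that assumption, the rearrangement inequality implies that an optimal ranking centroid is obtained by ranking the coordinate-wise sums of the observations. The function names `rank_centroid`, `total_sq_dist`, and `brute_force_cost` are hypothetical, and the sketch uses a plain sort rather than trying to match the linear-time bound stated above.

```python
from itertools import permutations


def rank_centroid(rankings):
    """Return a permutation of 1..m that minimizes the total squared
    Euclidean distance to the given ranking vectors (assumed objective)."""
    m = len(rankings[0])
    # Coordinate-wise sums induce the same ordering as coordinate-wise means.
    sums = [sum(r[j] for r in rankings) for j in range(m)]
    # Positions sorted by sum; ties are broken by index (any tie-break is optimal).
    order = sorted(range(m), key=lambda j: (sums[j], j))
    centroid = [0] * m
    for rank, j in enumerate(order, start=1):
        centroid[j] = rank
    return centroid


def total_sq_dist(center, rankings):
    """Sum of squared Euclidean distances from center to every ranking."""
    return sum(sum((a - b) ** 2 for a, b in zip(center, r)) for r in rankings)


def brute_force_cost(rankings):
    """Smallest achievable cost over all m! candidate centroids (tiny m only)."""
    m = len(rankings[0])
    return min(total_sq_dist(p, rankings) for p in permutations(range(1, m + 1)))


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
    c = rank_centroid(data)
    print(c, total_sq_dist(c, data), brute_force_cost(data))
```

On small inputs the brute-force check lets one confirm that the ranked-sums centroid reaches the minimum cost; for realistic vector lengths only `rank_centroid` is practical.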
Similar Papers
Clustering Approaches for Mixed-Type Data: A Comparative Study
Machine Learning (Stat)
Finds patterns in mixed-type data.
Dimension-Free Parameterized Approximation Schemes for Hybrid Clustering
Data Structures and Algorithms
Helps group data better, even in complex shapes.
Statistical and computational challenges in ranking
Statistics Theory
Ranks people by how good their answers are.