
Scalable unsupervised feature selection via weight stability

Published: June 6, 2025 | arXiv ID: 2506.06114v3

By: Xudong Zhang, Renato Cordeiro de Amorim

Potential Business Impact:

Finds important patterns in messy data.

Business Areas:

A/B Testing, Data and Analytics

Unsupervised feature selection is critical for improving clustering performance in high-dimensional data, where irrelevant features can obscure meaningful structure. In this work, we introduce the Minkowski Weighted $k$-means++, a novel initialisation strategy for the Minkowski Weighted $k$-means. Our initialisation selects centroids probabilistically using feature relevance estimates derived from the data itself. Building on this, we propose two new feature selection algorithms: FS-MWK++, which aggregates feature weights across a range of Minkowski exponents to identify stable and informative features, and SFS-MWK++, a scalable variant based on subsampling. We support our approach with a theoretical guarantee under mild assumptions and extensive experiments showing that our methods consistently outperform existing alternatives. Our software can be found at https://github.com/xzhang4-ops1/FSMWK.
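
As a rough illustration of what "aggregating feature weights across a range of Minkowski exponents" could look like, the sketch below computes per-cluster feature weights with the standard Minkowski Weighted $k$-means update, $w_{kv} = 1 / \sum_u (D_{kv}/D_{ku})^{1/(p-1)}$, and then averages them over several exponents $p$. This is not the authors' implementation: the function names `mwk_feature_weights` and `aggregate_weights`, the use of the cluster mean in place of the Minkowski centre, the plain averaging rule, and the assumption that cluster labels are already known are all simplifications made here for illustration. The paper's actual initialisation and aggregation are in the linked repository.

```python
import numpy as np


def mwk_feature_weights(X, labels, p, eps=1e-12):
    """Per-cluster feature weights in the style of Minkowski Weighted k-means.

    w[k, v] = 1 / sum_u (D[k, v] / D[k, u]) ** (1 / (p - 1)),
    where D[k, v] is the within-cluster dispersion of feature v. Requires p > 1.
    """
    if p <= 1:
        raise ValueError("p must be greater than 1")
    clusters = np.unique(labels)
    weights = np.empty((clusters.size, X.shape[1]))
    for row, k in enumerate(clusters):
        members = X[labels == k]
        # Simplification (assumption): use the mean, not the Minkowski centre.
        centre = members.mean(axis=0)
        # eps keeps the ratios finite if a feature has zero dispersion.
        D = (np.abs(members - centre) ** p).sum(axis=0) + eps
        ratios = (D[:, None] / D[None, :]) ** (1.0 / (p - 1.0))
        weights[row] = 1.0 / ratios.sum(axis=1)
    return weights


def aggregate_weights(X, labels, exponents):
    """Hypothetical aggregation: average weights over clusters, then over p."""
    per_p = [mwk_feature_weights(X, labels, p).mean(axis=0) for p in exponents]
    return np.mean(per_p, axis=0)


# Toy data: feature 0 separates two clusters, feature 1 is pure noise.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
informative = np.where(labels == 0, 0.0, 5.0) + rng.normal(0.0, 0.1, size=100)
noise = rng.normal(0.0, 3.0, size=100)
X = np.column_stack([informative, noise])

scores = aggregate_weights(X, labels, exponents=[1.5, 2.0, 3.0])
print(scores)  # one aggregated relevance score per feature
```

Features whose within-cluster dispersion stays small for every exponent keep a high score under this scheme, which is one plausible reading of "stable" weights; a selection step would then keep the top-scoring features.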

Class-Level Feature Selection Method Using Feature Weighted Growing Self-Organising Maps

Machine Learning (CS)

Finds the best clues for each specific problem.

14 Mar 2025 0

86%

Shapley-Inspired Feature Weighting in $k$-means with No Additional Hyperparameters

Machine Learning (CS)

Finds important patterns by ignoring bad data.

11 Aug 2025 1

85%

Semi-Supervised Federated Multi-Label Feature Selection with Fuzzy Information Measures

Machine Learning (CS)

Helps computers learn from unlabeled data.

21 Nov 2025 0


Country of Origin

🇬🇧 United Kingdom

Page Count

26 pages
