Topological Metric for Unsupervised Embedding Quality Evaluation

Published: December 17, 2025 | arXiv ID: 2512.15285v1

By: Aleksei Shestov, Anton Klenitskiy, Daria Denisova and more

Modern representation learning increasingly relies on unsupervised and self-supervised methods trained on large-scale unlabeled data. While these approaches achieve impressive generalization across tasks and domains, evaluating embedding quality without labels remains an open challenge. In this work, we propose Persistence, a topology-aware metric based on persistent homology that quantifies the geometric structure and topological richness of embedding spaces in a fully unsupervised manner. Unlike metrics that assume linear separability or rely on covariance structure, Persistence captures global and multi-scale organization. Empirical results across diverse domains show that Persistence consistently achieves top-tier correlations with downstream performance, outperforming existing unsupervised metrics and enabling reliable model and hyperparameter selection.
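To make the idea of a persistent-homology summary concrete, here is a minimal sketch of one very simple topological statistic. It is not the Persistence metric defined in the paper, which the abstract does not specify. It computes the total lifetime of 0-dimensional features (connected components) in a Vietoris-Rips filtration, using the standard fact that every component is born at scale 0 and the finite death scales are exactly the edge lengths of a Euclidean minimum spanning tree. The function name `h0_total_persistence` is hypothetical, and the sketch assumes the convention that an edge enters the filtration at the full pairwise distance (some libraries use half that distance).

```python
import math


def h0_total_persistence(points):
    """Sum of finite H0 bar lengths for a Vietoris-Rips filtration.

    Illustrative only: this is NOT the paper's Persistence metric.
    Every connected component is born at scale 0 and dies when it merges
    with another one, so the finite death scales are the edge lengths of
    a minimum spanning tree (built here with Prim's algorithm).
    """
    pts = [tuple(float(x) for x in p) for p in points]
    n = len(pts)
    if n < 2:
        return 0.0

    in_tree = [False] * n
    in_tree[0] = True
    # best[i] = shortest distance from point i to any point already in the tree
    best = [math.dist(pts[0], p) for p in pts]

    total = 0.0
    for _ in range(n - 1):
        # Pick the point outside the tree that is closest to the tree.
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        total += best[j]  # this merge ends one H0 bar at scale best[j]
        in_tree[j] = True
        # Update the cheapest connection for the remaining points.
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(pts[j], pts[i]))
    return total


if __name__ == "__main__":
    # Toy "embeddings": three points on a line.
    print(h0_total_persistence([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]))
```

A statistic like this needs no labels, only the embedding vectors. Richer summaries would also use higher-dimensional features (loops, voids), which require a dedicated persistent-homology library.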

Category: Computer Science, Machine Learning (CS)