Unsupervised Learning of Density Estimates with Topological Optimization
By: Suina Tanweer, Firas A. Khasawneh
Potential Business Impact:
Automatically picks the smoothing setting (bandwidth) for density estimation.
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics, and signal processing. However, this unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical because it controls the bias-variance trade-off: too large a bandwidth over-smooths the topological features of the estimate, and too small a bandwidth under-smooths them. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops, and voids, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach that uses a topology-based loss function to select the optimal bandwidth automatically, and we benchmark it against classical techniques, demonstrating its potential across different dimensions.
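To make the bandwidth sensitivity described in the abstract concrete, here is a minimal sketch in Python (NumPy only). It is not the authors' method: it does not use persistent homology or a topology-based loss. It only counts the local maxima of a one-dimensional Gaussian kernel density estimate as a crude stand-in for the "connected components" feature. The function names `gaussian_kde_1d` and `count_modes`, the synthetic two-cluster data, and the three bandwidth values are all illustrative assumptions.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on a 1-D grid."""
    # Pairwise standardized distances between grid points and samples.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def count_modes(density):
    """Count strict interior local maxima of a sampled density.

    Hypothetical helper: a rough proxy for the number of connected
    components of the density's upper level sets, not a persistence
    computation.
    """
    inner = density[1:-1]
    return int(np.sum((inner > density[:-2]) & (inner > density[2:])))


# Synthetic data (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(-2.0, 0.5, 200),
    rng.normal(2.0, 0.5, 200),
])
grid = np.linspace(-5.0, 5.0, 1001)

# Illustrative bandwidths: very small, moderate, very large.
mode_counts = {
    h: count_modes(gaussian_kde_1d(samples, grid, h))
    for h in (0.02, 0.5, 3.0)
}

for h, n_modes in mode_counts.items():
    print(f"bandwidth={h}: {n_modes} local maxima")
```

Under these assumptions, the smallest bandwidth yields many spurious peaks (under-smoothing), the largest merges the two clusters into a single peak (over-smoothing), and the moderate value keeps the two-cluster structure. A topology-based criterion of the kind the abstract proposes would aim to make this choice automatically, including in dimensions where the estimate cannot be plotted.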
Similar Papers
Density Estimation from Aggregated Data with Integrated Auxiliary Information: Estimating Population Densities with Geospatial Data
Applications
Makes maps more accurate with extra clues.
Concentration bounds for intrinsic dimension estimation using Gaussian kernels
Statistics Theory
Helps computers guess how complex data is.
Global Optimization of Stochastic Black-Box Functions with Arbitrary Noise Distributions using Wilson Score Kernel Density Estimation
Machine Learning (Stat)
Finds best robot designs faster, even with guesswork.