Confidence Sets for Multidimensional Scaling
By: Siddharth Vishwanath, Ery Arias-Castro
Potential Business Impact:
Finds hidden patterns in messy data.
We develop a formal statistical framework for classical multidimensional scaling (CMDS) applied to noisy dissimilarity data. We establish distributional convergence results for the embeddings produced by CMDS for various noise models, which enable the construction of bona fide uniform confidence sets for the latent configuration, up to rigid transformations. We further propose bootstrap procedures for constructing these confidence sets and provide theoretical guarantees for their validity. We find that the multiplier bootstrap adapts automatically to heteroscedastic noise such as multiplicative noise, while the empirical bootstrap seems to require homoscedasticity. Either form of bootstrap, when valid, is shown to substantially improve finite-sample accuracy. The empirical performance of the proposed methods is demonstrated through numerical experiments.
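For readers unfamiliar with CMDS, the embedding step the abstract refers to can be sketched with the textbook algorithm: square the dissimilarities, double-center them, and keep the top eigenvectors. This is a minimal illustration only, not the authors' confidence-set or bootstrap procedure. The function name `classical_mds`, the simulated points, and the additive Gaussian noise are all assumptions made for the example.

```python
import numpy as np

def classical_mds(D, d):
    """Textbook classical MDS: embed n items in R^d from an n x n dissimilarity matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Centering matrix H = I - (1/n) * 1 1^T
    H = np.eye(n) - np.ones((n, n)) / n
    # Double-centered matrix B = -1/2 * H * (D squared entrywise) * H
    B = -0.5 * H @ (D ** 2) @ H
    # B is symmetric, so eigh applies; pick the d largest eigenvalues
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d]
    # Noise can push eigenvalues slightly negative; clip before the square root
    top_vals = np.clip(vals[idx], 0.0, None)
    return vecs[:, idx] * np.sqrt(top_vals)

# Hypothetical example: a latent 2-D configuration observed through noisy distances
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# Symmetric additive noise with zero diagonal (one illustrative noise model)
E = np.triu(rng.normal(scale=0.01, size=D.shape), 1)
E = E + E.T

Y = classical_mds(D + E, d=2)  # estimate of X, up to a rigid transformation
```

Because pairwise distances are unchanged by rotations, reflections, and translations, `Y` can only recover `X` up to a rigid transformation, which is why the confidence sets in the abstract are stated in those terms.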
Similar Papers
Bootstrap Consistency for Empirical Likelihood in Density Ratio Models
Statistics Theory
Helps check if math guesses are right.
Statistical Inference for Manifold Similarity and Alignability across Noisy High-Dimensional Datasets
Statistics Theory
Compares complex data by looking at its hidden shapes.
High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data
Machine Learning (Stat)
Finds best groups in messy, big data.