On the Information Processing of One-Dimensional Wasserstein Distances with Finite Samples
By: Cheongjae Jang, Jonghyun Won, Soyeon Jun and more
Potential Business Impact:
Finds important differences in data patterns.
The Wasserstein distance -- a summation of sample-wise transport distances in data space -- is advantageous in many applications for measuring support differences between two underlying density functions. However, when supports significantly overlap while densities exhibit substantial pointwise differences, it remains unclear whether and how this transport information can accurately identify these differences, and in particular how they can be characterized analytically in finite-sample settings. We address this issue by analyzing the information processing capabilities of the one-dimensional Wasserstein distance with finite samples. Using the Poisson process and isolating the rate factor, we demonstrate that Wasserstein distances can capture the pointwise density difference and show how this information harmonizes with support differences. The analyzed properties are confirmed using neural spike train decoding and amino acid contact frequency data. The results reveal that the one-dimensional Wasserstein distance highlights meaningful density differences related to both rate and support.
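As a rough illustration of the finite-sample quantity the abstract refers to, the sketch below computes an empirical one-dimensional Wasserstein distance by matching sorted samples. It is not the paper's analysis: the function name `wasserstein_1d`, the equal-sample-size restriction, the normalization by sample count, and the toy distributions are all assumptions made for this example.

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """Empirical p-Wasserstein distance between two equal-size 1D samples.

    In one dimension the optimal transport plan pairs the i-th smallest
    point of x with the i-th smallest point of y, so the distance reduces
    to an average of sample-wise transport costs over sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

# Toy data (hypothetical): three samples of 200 points each.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 1.0, size=200)       # uniform density on [0, 1]
b = rng.uniform(0.0, 1.0, size=200) ** 2  # same support, mass pushed toward 0
shifted = a + 0.5                         # same shape, support moved by 0.5

print("same support, different density:", wasserstein_1d(a, b))
print("shifted support:", wasserstein_1d(a, shifted))
```

The first comparison keeps the support fixed and changes only the density shape, while the second moves the support without changing the shape, which loosely mirrors the two kinds of difference the abstract distinguishes.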