Discrete Sequential Barycenter Arrays: Representation, Approximation, and Modeling of Probability Measures
By: Alejandro Jara, Carlos Sing-Long
Potential Business Impact:
Creates smart computer models that follow rules.
Constructing flexible probability models that respect constraints on key functionals -- such as the mean -- is a fundamental problem in nonparametric statistics. Existing approaches lack systematic tools for enforcing such constraints while retaining full modeling flexibility. This paper introduces a new representation for univariate probability measures based on discrete sequential barycenter arrays (SBA). We study structural properties of SBA representations and establish new approximation results. In particular, we show that for any target distribution, its SBA-based discrete approximations converge in both the weak topology and in Wasserstein distances, and that the representation is exact for all distributions with finite discrete support. We further characterize a broad class of measures whose SBA partitions exhibit regularity and induce increasingly fine meshes, and we prove that this class is dense in standard probabilistic topologies. These theoretical results enable the construction of probability models that preserve prescribed values -- or full distributions -- of the mean while maintaining large support. As an application, we derive a mixture model for density estimation whose induced mixing distribution has a fixed or user-specified mean. The resulting framework provides a principled mechanism for incorporating mean constraints in nonparametric modeling while preserving strong approximation properties. The approach is illustrated using both simulated and real data.
Similar Papers
Model averaging in the space of probability distributions
Methodology
Combines data to predict risks better.
Robust Representation and Estimation of Barycenters and Modes of Probability Measures on Metric Spaces
Statistics Theory
Makes computer math for shapes more reliable.
Multimarginal Schrödinger Barycenter
Statistics Theory
Finds the average of complex data faster.