Distribution Estimation with Side Information
By: Haricharan Balasundaram, Andrew Thangaraj
Potential Business Impact:
Uses word meanings to guess data better.
We consider the classical problem of discrete distribution estimation from i.i.d. samples in a novel scenario where additional side information is available on the distribution. In large-alphabet datasets such as text corpora, such side information arises naturally through word semantics/similarities, which can be inferred, for instance, from the closeness of vector word embeddings. We consider two specific models for side information: a local model, where the unknown distribution lies in the neighborhood of a known distribution, and a partial ordering model, where the alphabet is partitioned into known higher- and lower-probability sets. In both models, we theoretically characterize the improvement in a suitable squared-error risk due to the available side information. Simulations over natural language and synthetic data illustrate these gains.
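The local model above can be illustrated with a small simulation. The sketch below is an illustrative assumption, not the paper's estimator: it shrinks the empirical estimate toward a known reference distribution `q` that the true distribution `p` is near, and compares the average squared error with and without this side information. The alphabet size, sample size, and shrinkage weight `lam` are ad hoc choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "local model" setup: the unknown p lies near a known q.
k = 100                                         # alphabet size
q = rng.dirichlet(np.ones(k))                   # known reference distribution
p = 0.9 * q + 0.1 * rng.dirichlet(np.ones(k))   # unknown, close to q

n = 500        # i.i.d. samples per trial
trials = 2000

mse_plain, mse_side = 0.0, 0.0
for _ in range(trials):
    counts = rng.multinomial(n, p)
    p_hat = counts / n                          # empirical (add-nothing) estimator
    # Illustrative shrinkage toward q; the weight is an ad hoc choice,
    # not the optimal weighting derived in the paper.
    lam = k / (k + n)
    p_side = (1 - lam) * p_hat + lam * q
    mse_plain += np.sum((p_hat - p) ** 2)
    mse_side += np.sum((p_side - p) ** 2)

print(f"empirical risk:        {mse_plain / trials:.5f}")
print(f"risk with side info:   {mse_side / trials:.5f}")
```

Because `p` is close to `q`, the bias introduced by shrinking toward `q` is small while the variance reduction is substantial, so the side-informed estimator attains a lower average squared error; this mirrors, in a toy setting, the kind of risk improvement the paper characterizes theoretically.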
Similar Papers
One-Bit Distributed Mean Estimation with Unknown Variance
Information Theory
Helps computers guess averages with tiny messages.
Simple and Sharp Generalization Bounds via Lifting
Statistics Theory
Makes computer learning more accurate and faster.
Partitioning the Sample Space for a More Precise Shannon Entropy Estimation
Machine Learning (CS)
Helps guess hidden information from limited data.