Distribution Estimation with Side Information
By: Haricharan Balasundaram, Andrew Thangaraj
Potential Business Impact:
Uses word meanings to guess data better.
We consider the classical problem of discrete distribution estimation from i.i.d. samples in a novel scenario where additional side information is available on the distribution. In large-alphabet datasets such as text corpora, such side information arises naturally through word semantics/similarities, which can be inferred, for instance, from the closeness of vector word embeddings. We consider two specific models for side information: a local model, where the unknown distribution lies in the neighborhood of a known distribution, and a partial ordering model, where the alphabet is partitioned into known higher- and lower-probability sets. In both models, we theoretically characterize the improvement in a suitable squared-error risk due to the available side information. Simulations over natural language and synthetic data illustrate these gains.
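The local model above can be illustrated with a small simulation. The sketch below is an illustrative assumption, not the paper's estimator: it shrinks the empirical estimate toward a known reference distribution `q` that the true distribution `p` is near, and compares the average squared error with and without this side information. The alphabet size, sample size, and shrinkage weight `lam` are ad hoc choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "local model" setup: the unknown p lies near a known q.
k = 100                                         # alphabet size
q = rng.dirichlet(np.ones(k))                   # known reference distribution
p = 0.9 * q + 0.1 * rng.dirichlet(np.ones(k))   # unknown, close to q

n = 500        # i.i.d. samples per trial
trials = 2000

mse_plain, mse_side = 0.0, 0.0
for _ in range(trials):
    counts = rng.multinomial(n, p)
    p_hat = counts / n                          # empirical (add-nothing) estimator
    # Illustrative shrinkage toward q; the weight is an ad hoc choice,
    # not the optimal weighting derived in the paper.
    lam = k / (k + n)
    p_side = (1 - lam) * p_hat + lam * q
    mse_plain += np.sum((p_hat - p) ** 2)
    mse_side += np.sum((p_side - p) ** 2)

print(f"empirical risk:        {mse_plain / trials:.5f}")
print(f"risk with side info:   {mse_side / trials:.5f}")
```

Because `p` is close to `q`, the bias introduced by shrinking toward `q` is small while the variance reduction is substantial, so the side-informed estimator attains a lower average squared error; this mirrors, in a toy setting, the kind of risk improvement the paper characterizes theoretically.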
Similar Papers
One-Bit Distributed Mean Estimation with Unknown Variance
Information Theory
Helps computers guess averages with tiny messages.
Simple and Sharp Generalization Bounds via Lifting
Statistics Theory
Makes computer learning more accurate and faster.
Partitioning the Sample Space for a More Precise Shannon Entropy Estimation
Machine Learning (CS)
Helps guess hidden information from limited data.