Approximations for the number of maxima and near-maxima in independent data
By: Fraser Daly
Potential Business Impact:
Gives explicit error bounds for simple approximations to how many observations tie for, or fall close to, the largest value in a sample.
Given $n$ independent observations of a random variable $X$, we derive explicit error bounds in total variation distance when approximating the number of observations equal to the maximum of the sample (in the case where $X$ is discrete) or the number of observations within a given distance of an order statistic of the sample (in the case where $X$ is absolutely continuous). The logarithmic and Poisson distributions are used as approximations in the discrete case, with proofs that include the development of Stein's method for a logarithmic target distribution. In the absolutely continuous case our approximations are by the negative binomial distribution, and are established by considering negative binomial approximation for mixed binomials. The cases where $X$ is geometric, Gumbel and uniform are used as illustrative examples.
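To make the quantities in the abstract concrete, the following is a minimal simulation sketch, not the paper's method. It estimates by Monte Carlo the distribution of the number of observations equal to the sample maximum when $X$ is geometric, and includes a helper for total variation distance between two probability mass functions. The function names, the choice of the geometric distribution on {1, 2, ...}, the use of the maximum as the order statistic in `count_near_maximum`, and the parameter values are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def sample_geometric(p, rng):
    """Draw from the geometric distribution on {1, 2, ...} by inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return int(math.floor(math.log(u) / math.log(1.0 - p))) + 1


def count_maxima(sample):
    """Number of observations equal to the sample maximum (discrete case)."""
    m = max(sample)
    return sum(1 for x in sample if x == m)


def count_near_maximum(sample, a):
    """Number of observations within distance a of the sample maximum.

    Assumption: the maximum is used as the order statistic, and the
    maximum itself is included in the count.
    """
    m = max(sample)
    return sum(1 for x in sample if m - x <= a)


def empirical_maxima_distribution(n, p, trials, seed=0):
    """Monte Carlo estimate of the pmf of the number of maxima."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        sample = [sample_geometric(p, rng) for _ in range(n)]
        counts[count_maxima(sample)] += 1
    return {k: counts[k] / trials for k in sorted(counts)}


def total_variation(pmf_a, pmf_b):
    """Total variation distance between two pmfs given as dicts."""
    support = set(pmf_a) | set(pmf_b)
    return 0.5 * sum(abs(pmf_a.get(k, 0.0) - pmf_b.get(k, 0.0)) for k in support)


if __name__ == "__main__":
    # Illustrative parameter values only.
    pmf = empirical_maxima_distribution(n=50, p=0.5, trials=2000)
    for k, prob in pmf.items():
        print(k, round(prob, 4))
```

An empirical pmf produced this way could be passed to `total_variation` together with a candidate approximating pmf (for example a logarithmic or Poisson pmf, with parameters taken from the paper) to compare a simulated error against the stated bounds. Monte Carlo noise means such a comparison is only indicative.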
Similar Papers
Maximal Inequalities for Independent Random Vectors
Probability
Finds better math rules for guessing unknown things.
Approximating the Total Variation Distance between Gaussians
Data Structures and Algorithms
Measures how different two "normal" data sets are.
Estimation of discrete distributions in relative entropy, and the deviations of the missing mass
Statistics Theory
Finds hidden patterns in data more accurately.