Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means
By: Mikael Møller Høgsgaard, Andrea Paudice
Potential Business Impact:
Finds averages in tricky data better.
The Median of Means (MoM) is a mean estimator that has gained popularity in the context of heavy-tailed data. In this work, we analyze its performance in the task of simultaneously estimating the mean of each function in a class $\mathcal{F}$ when the data distribution possesses only the first $p$ moments for $p \in (1,2]$. We prove a new sample complexity bound using a novel symmetrization technique that may be of independent interest. Additionally, we present applications of our result to $k$-means clustering with unbounded inputs and linear regression with general losses, improving upon existing works.
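For readers unfamiliar with the estimator, the following is a minimal Python sketch of the basic Median-of-Means idea: split the sample into blocks, average each block, and take the median of the block averages. It is an illustration under stated assumptions, not the construction analyzed in the paper. The function names `median_of_means` and `uniform_median_of_means`, the contiguous equal-size blocking, and the choice to reuse the same blocks for every function in a finite class are all assumptions made here for illustration.

```python
import random
import statistics
from typing import Callable, List, Sequence


def median_of_means(values: Sequence[float], num_blocks: int) -> float:
    """Median of the means of `num_blocks` contiguous, equal-size blocks.

    Assumes `values` is an i.i.d. sample in arbitrary order (shuffle first
    if the order carries structure). Leftover points that do not fill a
    complete block are dropped.
    """
    n = len(values)
    if not 1 <= num_blocks <= n:
        raise ValueError("num_blocks must be between 1 and len(values)")
    block_size = n // num_blocks
    block_means = [
        statistics.fmean(values[b * block_size:(b + 1) * block_size])
        for b in range(num_blocks)
    ]
    return statistics.median(block_means)


def uniform_median_of_means(
    functions: Sequence[Callable[[float], float]],
    samples: Sequence[float],
    num_blocks: int,
) -> List[float]:
    """Apply MoM to f(X_1), ..., f(X_n) for each f in a finite class.

    Hypothetical helper: it reuses the same blocks for every function,
    which is one natural reading of "simultaneous" estimation.
    """
    return [
        median_of_means([f(x) for x in samples], num_blocks)
        for f in functions
    ]


if __name__ == "__main__":
    # Pareto with shape 1.5 has mean 3 but infinite variance, so it has
    # finite p-th moments only for p < 1.5.
    rng = random.Random(0)
    data = [rng.paretovariate(1.5) for _ in range(10_000)]
    print("empirical mean :", statistics.fmean(data))
    print("median of means:", median_of_means(data, num_blocks=50))
```

The number of blocks trades robustness against variance: a single block reduces to the empirical mean, while more blocks tolerate more extreme observations at the cost of noisier block averages.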
Similar Papers
On the Optimality of the Median-of-Means Estimator under Adversarial Contamination
Machine Learning (Stat)
Protects computer guesses from bad data.
Convex Clustering Redefined: Robust Learning with the Median of Means Estimator
Machine Learning (Stat)
Finds hidden groups in messy data without guessing.
General Form Moment-based Estimator of Weibull, Gamma, and Log-normal Distributions
Methodology
Finds hidden patterns in numbers more easily.