Adaptive tail index estimation: minimal assumptions and non-asymptotic guarantees
By: Johannes Lederer, Anne Sabourin, Mahsa Taheri
Potential Business Impact:
Helps estimate how likely rare, extreme events are in data.
A notoriously difficult challenge in extreme value theory is the choice of the number $k \ll n$ of extreme data points, out of a total sample size $n$, to use for inference about tail quantities. Existing theoretical guarantees for adaptive methods typically require second-order or von Mises assumptions that are difficult to verify and often come with tuning parameters that are challenging to calibrate. This paper revisits the problem of adaptively selecting $k$ for the Hill estimator. Our goal is not an `optimal' $k$ but one that is `good enough', in the sense that we strive for non-asymptotic guarantees that may be sub-optimal but are explicit and require minimal conditions. We propose a transparent adaptive rule that does not require preliminary calibration of constants, inspired by the `adaptive validation' methodology developed in high-dimensional statistics. A key feature of our approach is the use of a grid for $k$ of size $\ll n$, which aligns with common practice among practitioners but has remained unexplored in theoretical analyses. Our rule involves only an explicit expression of a variance-type term; in particular, it does not require controlling or estimating a bias term. Our theoretical analysis is valid for all heavy-tailed distributions, specifically for all regularly varying survival functions. Furthermore, when von Mises conditions hold, our method achieves `almost' minimax optimality with a rate of $\sqrt{\log \log n}\, n^{-|\rho|/(1+2|\rho|)}$ when the grid size is of order $\log n$, in contrast to the $(\log \log (n)/n)^{|\rho|/(1+2|\rho|)}$ rate in existing work. Our simulations show that our approach performs particularly well for ill-behaved distributions.
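The abstract refers to the Hill estimator and an adaptive-validation rule for choosing $k$ over a small grid. The Python sketch below illustrates that general pipeline under stated assumptions: the geometric grid of size of order $\log n$, the variance proxy $\hat\gamma_k \sqrt{\log(1/\delta)/k}$, and the Lepski-style pairwise comparison are hypothetical stand-ins, not the paper's exact rule.

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator of the tail index gamma from the k largest observations."""
    xs = np.sort(x)[::-1]  # descending order statistics X_(1) >= ... >= X_(n)
    return float(np.mean(np.log(xs[:k] / xs[k])))

def adaptive_k(x, grid=None, delta=0.05):
    """Illustrative Lepski-style adaptive choice of k over a small grid.

    NOTE: a sketch only. The variance proxy v(k) and the comparison rule
    below are hypothetical; the paper's rule uses its own explicit
    variance-type term and requires no calibrated constants.
    """
    n = len(x)
    if grid is None:
        # geometric grid of size O(log n), assuming n is reasonably large
        num = max(2, int(np.log(n)))
        grid = np.unique(np.geomspace(10, n // 2, num=num).astype(int))
    gammas = {k: hill_estimator(x, k) for k in grid}
    # hypothetical variance-type term: gamma_hat(k) * sqrt(log(1/delta)/k)
    v = {k: gammas[k] * np.sqrt(np.log(1.0 / delta) / k) for k in grid}
    # pick the largest k (smallest variance) whose estimate stays within
    # the confidence bands of every smaller k on the grid (smaller bias)
    for k in sorted(grid, reverse=True):
        if all(abs(gammas[k] - gammas[kp]) <= v[k] + v[kp]
               for kp in grid if kp < k):
            return k, gammas[k]

# usage: Pareto sample with tail index gamma = 1/2 (alpha = 2)
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=5000) + 1.0
k_hat, gamma_hat = adaptive_k(sample)
print(f"k = {k_hat}, gamma_hat = {gamma_hat:.3f}")  # expect roughly 0.5
```

On this sketch's design: the loop accepts the largest admissible $k$ because the Hill estimator's variance shrinks as $k$ grows while its bias grows, which is the usual bias-variance trade-off that adaptive rules of this kind navigate.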
Similar Papers
Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees
Statistics Theory
Makes AI create realistic images from messy data.
Robust Tail Index Estimation under Random Censoring via Minimum Density Power Divergence
Statistics Theory
Finds rare events in tricky data.
Adaptive estimation in regression models for weakly dependent data and explanatory variable with known density
Statistics Theory
Finds patterns in messy data for better predictions.