Extreme-PLS with missing data under weak dependence
By: Stéphane Girard, Cambyse Pakzad
Potential Business Impact:
Finds important patterns in messy, incomplete data.
This paper develops a theoretical framework for Extreme Partial Least Squares (EPLS) dimension reduction in the presence of missing data and weak temporal dependence. Building upon the recent EPLS methodology for modeling extremal dependence between a response variable and high-dimensional covariates, we extend the approach to more realistic data settings where both serial correlation and missingness occur. Specifically, we consider a single-index inverse regression model under heavy-tailed conditions and introduce a Missing-at-Random (MAR) mechanism acting on the covariates, in which the probability of missingness depends on how extreme the response is. The asymptotic behavior of the proposed estimator is established within an alpha-mixing framework, leading to consistency results under regularly varying tails. Extensive Monte Carlo experiments covering eleven dependence schemes (including ARMA, GARCH, and nonlinear ESTAR processes) show that the method remains robust across a wide range of heavy-tailed and dependent scenarios, even when substantial portions of the data are missing. A real-world application to environmental data further confirms the method's capacity to recover meaningful tail directions.
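To make the setting in the abstract concrete, here is a minimal Python sketch. It is not the paper's estimator: it assumes the tail direction can be approximated by the empirical covariance between the covariates and the response over observations whose response exceeds a high quantile, and it handles missing covariates by simple complete-case deletion. The function name `tail_pls_direction`, the linear link, the Pareto response, the i.i.d. sampling (no serial dependence), and the missingness probabilities are all illustrative assumptions.

```python
import numpy as np


def tail_pls_direction(X, Y, observed, quantile=0.9):
    """Unit-norm direction from the empirical covariance of X and Y,
    restricted to rows with Y above a high quantile and X observed.
    Illustrative complete-case sketch, not the estimator from the paper."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    threshold = np.quantile(Y, quantile)
    keep = (Y >= threshold) & observed
    if keep.sum() < 2:
        raise ValueError("not enough observed tail points")

    Xt, Yt = X[keep], Y[keep]
    # Empirical covariance between each covariate and the response in the tail.
    v = ((Xt - Xt.mean(axis=0)) * (Yt - Yt.mean())[:, None]).mean(axis=0)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        raise ValueError("degenerate direction (zero covariance)")
    return v / norm


# Toy single-index inverse regression model: X = g(Y) * beta + noise,
# with the assumed link g(y) = y and a heavy-tailed (Pareto) response.
rng = np.random.default_rng(0)
n, p = 5000, 5
beta = np.ones(p) / np.sqrt(p)
Y = rng.pareto(3.0, size=n) + 1.0
X = np.outer(Y, beta) + rng.standard_normal((n, p))

# Assumed MAR mechanism: covariates are observed less often when Y is extreme.
is_extreme = Y >= np.quantile(Y, 0.9)
observed = rng.random(n) < np.where(is_extreme, 0.6, 0.9)
X[~observed] = np.nan  # missing covariate rows

direction = tail_pls_direction(X, Y, observed)
print("cosine with beta:", float(direction @ beta))
```

Because the missingness here depends only on the response, dropping incomplete rows leaves the conditional distribution of the covariates given the response unchanged, which is why the complete-case shortcut is a reasonable stand-in for this toy example.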
Similar Papers
A PLS-Integrated LASSO Method with Application in Index Tracking
Machine Learning (Stat)
Makes predicting stock prices more accurate.
Parsimonious Factor Models for Asymmetric Dependence in Multivariate Extremes
Methodology
Predicts rare, extreme weather and financial events.