Extreme-PLS with missing data under weak dependence

Published: November 14, 2025 | arXiv ID: 2511.11338v1

By: Stéphane Girard, Cambyse Pakzad

Potential Business Impact:

Finds important patterns in messy, incomplete data.

Business Areas:
A/B Testing Data and Analytics

This paper develops a theoretical framework for Extreme Partial Least Squares (EPLS) dimension reduction in the presence of missing data and weak temporal dependence. Building upon the recent EPLS methodology for modeling extremal dependence between a response variable and high-dimensional covariates, we extend the approach to more realistic data settings where both serial correlation and missingness occur. Specifically, we consider a single-index inverse regression model under heavy-tailed conditions and introduce a Missing-at-Random (MAR) mechanism acting on the covariates, whose probability depends on the extremeness of the response. The asymptotic behavior of the proposed estimator is established within an alpha-mixing framework, leading to consistency results under regularly varying tails. Extensive Monte Carlo experiments covering eleven dependence schemes (including ARMA, GARCH, and nonlinear ESTAR processes) demonstrate that the method performs robustly across a wide range of heavy-tailed and dependent scenarios, even when substantial portions of data are missing. A real-world application to environmental data further confirms the method's capacity to recover meaningful tail directions.
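
To make the setting in the abstract concrete, here is a minimal simulation sketch in Python. It is not the authors' estimator or experimental design. It assumes i.i.d. data (so it ignores serial dependence), a Pareto response, a linear link in the single-index inverse regression model, and a simple MAR rule in which covariates go missing more often when the response is extreme. The function names `simulate` and `tail_direction`, the missingness probabilities, and the complete-case treatment of missing rows are all illustrative assumptions. The direction is taken as the normalized empirical covariance between covariates and response among tail observations, in the spirit of EPLS.

```python
import numpy as np


def simulate(n=20000, p=5, gamma=0.25, seed=0):
    """Toy single-index inverse regression X = Y * beta + noise (assumed form)."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(p)
    beta[:2] = 1.0 / np.sqrt(2.0)  # hypothetical unit-norm index direction
    # Pareto response with tail index gamma (heavy right tail).
    y = (1.0 - rng.random(n)) ** (-gamma)
    x = np.outer(y, beta) + rng.standard_normal((n, p))
    # Assumed MAR rule: a covariate row is missing with probability 0.4
    # when Y is in its top 5%, and with probability 0.1 otherwise.
    thr = np.quantile(y, 0.95)
    p_miss = np.where(y >= thr, 0.4, 0.1)
    observed = rng.random(n) >= p_miss
    x[~observed] = np.nan
    return x, y, observed, beta


def tail_direction(x, y, observed, tail_frac=0.05):
    """Normalized tail covariance between X and Y, using complete cases only."""
    thr = np.quantile(y, 1.0 - tail_frac)
    keep = observed & (y >= thr)  # observed rows whose response is extreme
    xk, yk = x[keep], y[keep]
    v = ((xk - xk.mean(axis=0)) * (yk - yk.mean())[:, None]).mean(axis=0)
    return v / np.linalg.norm(v)


x, y, observed, beta = simulate()
w = tail_direction(x, y, observed)
cosine = abs(float(w @ beta))  # alignment between estimated and true direction
print(f"missing rows: {np.mean(~observed):.1%}, |cos(w, beta)| = {cosine:.3f}")
```

In this toy setup the missingness probability is constant within the tail, so dropping incomplete rows does not bias the tail covariance. The paper treats a more general mechanism under alpha-mixing dependence, which this sketch does not attempt to reproduce.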

Country of Origin
🇫🇷 France

Page Count
45 pages

Category
Statistics:
Methodology