High Dimensional Gaussian and Bootstrap Approximations in Generalized Linear Models
By: Mayukh Choudhury, Debraj Das
Generalized Linear Models (GLMs) extend ordinary linear regression by linking the mean of the response variable to covariates through appropriate link functions. This paper investigates the asymptotic behavior of GLM estimators when the parameter dimension $d$ grows with the sample size $n$. In the first part, we establish Gaussian approximation results for the distribution of a properly centered and scaled GLM estimator, uniformly over the class of convex sets and over Euclidean balls. Using high-dimensional results of Fang and Koike (2024) for the leading Bahadur term, bounding the remainder terms as in He and Shao (2000), and applying Nazarov's (2003) Gaussian isoperimetric inequality, we show that the Gaussian approximation holds when $d = o(n^{2/5})$ for convex sets and $d = o(n^{1/2})$ for Euclidean balls; these are the best possible rates, matching those for high-dimensional sample means. We further extend these results to bootstrap approximation when the covariance matrix is unknown. In the second part, where $d \gg n$, a natural question is whether all covariates are equally important. To address this, we impose sparsity in the GLM through the Lasso estimator. Although the Lasso is widely used for variable selection, it cannot achieve Variable Selection Consistency (VSC) and $n^{1/2}$-consistency simultaneously (Lahiri, 2021). In the regime that ensures VSC, we show that the Gaussian approximation for the Lasso estimator fails. To overcome this, we propose a Perturbation Bootstrap (PB) approach and establish a Berry-Esseen type bound for its approximation, uniformly over the class of convex sets. Simulation studies confirm the strong finite-sample performance of the proposed method.
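To make the perturbation-bootstrap idea concrete, here is a minimal sketch for an unpenalized logistic GLM in low dimension: each observation's contribution to the log-likelihood is reweighted by an i.i.d. mean-one random weight, and the reweighted estimates, centered at the original fit, mimic the sampling distribution of $\sqrt{n}(\hat{\beta} - \beta_0)$ over Euclidean balls. The Exp(1) weight distribution, the function name `neg_loglik`, and the omission of the Lasso penalty and studentization are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def neg_loglik(beta, X, y, w):
    # Weighted negative log-likelihood for logistic regression (canonical link).
    # logaddexp(0, eta) = log(1 + exp(eta)), computed stably.
    eta = X @ beta
    return np.sum(w * (np.logaddexp(0.0, eta) - y * eta))

# Simulate a low-dimensional logistic GLM (d fixed here purely for illustration).
n, d = 500, 5
X = rng.normal(size=(n, d))
beta0 = np.zeros(d)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta0)))

# Original fit: unit weights.
beta_hat = minimize(neg_loglik, np.zeros(d), args=(X, y, np.ones(n)),
                    method="BFGS").x

# Perturbation bootstrap: refit under i.i.d. mean-one weights
# (Exp(1) is an assumed choice; the paper's weight scheme may differ).
B = 200
boot = np.empty((B, d))
for b in range(B):
    w = rng.exponential(1.0, size=n)
    boot[b] = minimize(neg_loglik, beta_hat, args=(X, y, w), method="BFGS").x

# Centered, scaled bootstrap draws approximate sqrt(n)(beta_hat - beta0);
# estimate the probability content of a Euclidean ball of radius r.
T_boot = np.sqrt(n) * (boot - beta_hat)
r = 3.0
print("P(|T| <= r) approx:", np.mean(np.linalg.norm(T_boot, axis=1) <= r))
```

Comparing this bootstrap ball probability with a Monte Carlo estimate over repeated samples gives a finite-sample check of the Berry-Esseen type guarantee described in the abstract.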
Similar Papers
Unifiedly Efficient Inference on All-Dimensional Targets for Large-Scale GLMs
Methodology
Develops a unified, efficient inference framework for targets of arbitrary dimension in large-scale GLMs.
Asymptotics of Non-Convex Generalized Linear Models in High-Dimensions: A proof of the replica formula
Machine Learning (Stat)
Proves the replica formula for the asymptotics of non-convex GLMs in high dimensions.
Characterizing Finite-Dimensional Posterior Marginals in High-Dimensional GLMs via Leave-One-Out
Statistics Theory
Characterizes finite-dimensional posterior marginals in high-dimensional GLMs via a leave-one-out analysis.