Assumption-lean Inference for Network-linked Data
By: Wei Li, Nilanjan Chakraborty, Robert Lunde
Potential Business Impact:
Helps understand how people connect online.
We consider statistical inference for network-linked regression problems, where covariates may include network summary statistics computed for each node. In settings involving network data, it is often natural to posit that latent variables govern connection probabilities in the graph. Since the presence of these latent features makes classical regression assumptions even less tenable, we propose an assumption-lean framework for linear regression with jointly exchangeable regression arrays. We establish an analog of the Aldous-Hoover representation for such arrays, which may be of independent interest. Moreover, we consider two different projection parameters as potential targets and establish conditions under which asymptotic normality and bootstrap consistency hold when commonly used network statistics, including local subgraph frequencies and spectral embeddings, are used as covariates. In the case of linear regression with local count statistics, we show that a bias-corrected estimator makes it possible to target a more natural parameter under weaker sparsity conditions than the OLS estimator. Our inferential tools are illustrated using both simulated data and real data related to the academic climate of elementary schools.
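To make the setting concrete, the following is a minimal sketch of a network-linked regression in which per-node local count statistics (degree and triangle counts) serve as covariates in an OLS fit. Everything specific here is an assumption made for illustration and is not taken from the paper: the product-form connection probability, the sparsity level `rho`, the response function, the normalizations, and the helper names `local_counts`, `simulate_graph`, and `fit_ols`. It shows plain OLS only, not the bias-corrected estimator or the bootstrap procedure described above.

```python
import numpy as np


def local_counts(A):
    """Per-node degree and triangle counts from a symmetric 0/1 adjacency matrix."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # sum_j (A^2)_ij * A_ij equals (A^3)_ii for symmetric A; each triangle
    # through node i is counted twice (once per orientation), so divide by 2.
    triangles = ((A @ A) * A).sum(axis=1) / 2.0
    return degrees, triangles


def simulate_graph(n, rho, rng):
    """Latent-variable graph: edge (i, j) appears with probability rho * xi_i * xi_j.

    The product form is a hypothetical choice made only for this sketch.
    """
    xi = rng.uniform(size=n)
    P = rho * np.outer(xi, xi)
    upper = np.triu(rng.uniform(size=(n, n)) < P, k=1)  # strict upper triangle
    A = (upper | upper.T).astype(float)  # symmetrize; no self-loops
    return xi, A


def fit_ols(n=300, rho=0.5, seed=0):
    """Regress a latent-driven response on normalized local count statistics."""
    rng = np.random.default_rng(seed)
    xi, A = simulate_graph(n, rho, rng)
    degrees, triangles = local_counts(A)

    # Normalize counts to frequencies so the covariates stay on a comparable scale.
    X = np.column_stack([
        np.ones(n),
        degrees / (n - 1),
        triangles / ((n - 1) * (n - 2) / 2),
    ])

    # The response depends on the latent variable, not linearly on the covariates,
    # so the fitted coefficients estimate a projection, not a "true" linear model.
    y = np.sin(2 * np.pi * xi) + rng.normal(scale=0.5, size=n)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X, y, A


if __name__ == "__main__":
    beta, X, y, A = fit_ols()
    print("OLS coefficients (intercept, degree freq., triangle freq.):", beta)
```

Because the response in this sketch is driven by the latent variable and the linear model is misspecified by construction, the OLS coefficients are best read as estimates of a projection parameter, which is the kind of target an assumption-lean analysis is concerned with.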
Similar Papers
Weak Identification in Peer Effects Estimation
Statistics Theory
Fixes math models for how friends influence friends.
Robust High-Dimensional Covariate-Assisted Network Modeling
Methodology
Finds hidden patterns in connected data.
Estimation in linear models with clustered data
Econometrics
Helps understand how groups of people affect each other.