Scalable and Communication-Efficient Varying Coefficient Mixed Effect Models: Methodology, Theory, and Applications
By: Lida Chalangar Jalili Dehkharghani, Li-Hsiang Lin
Potential Business Impact:
Helps track where people move, even with bad internet.
Human migration exhibits complex spatiotemporal dependence driven by environmental and socioeconomic forces. Modeling such patterns at scale, often across multiple administrative or institutional boundaries, requires statistically efficient methods that remain robust under limited communication, i.e., when transmitting raw data or large design matrices across distributed nodes is costly or restricted. This paper develops a communication-efficient inference framework for Varying Coefficient Mixed Models (VCMMs) that accommodates many input variables in the mean structure and rich correlation induced by numerous random effects in hierarchical migration data. We show that a penalized spline estimator admits a Bayesian hierarchical representation, which in turn yields sufficient statistics that preserve the full likelihood contribution of each node when communication is unconstrained; aggregating these summaries reproduces the centralized estimator exactly. Under communication constraints, the same summaries define a surrogate likelihood enabling one-step estimation with first-order statistical efficiency. The framework also incorporates an SVD-enhanced implementation to ensure numerical stability and scalability, extending applicability to settings with many random effects, with or without communication limits. Statistical and theoretical guarantees are provided. Extensive simulations confirm the accuracy and robustness of the method. An application to U.S. migration flow data demonstrates its ability to efficiently and precisely uncover dynamic spatial patterns.
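The exact-aggregation claim in the abstract can be illustrated with a much simpler model than the paper's VCMM. The sketch below is not the authors' method: it assumes a plain penalized least-squares fit with a fixed smoothing parameter and no random effects, and every name in it (`node_summary`, `aggregate_and_solve`, `lam`, the simulated data) is hypothetical. Each node sends only its cross-products, and the summed summaries give the same coefficients as a fit on the pooled data.

```python
import numpy as np

def node_summary(X, y):
    """Summaries a node would transmit instead of raw data: X'X and X'y."""
    return X.T @ X, X.T @ y

def aggregate_and_solve(summaries, penalty, lam):
    """Sum the node summaries and solve the penalized normal equations."""
    XtX = sum(s[0] for s in summaries)
    Xty = sum(s[1] for s in summaries)
    return np.linalg.solve(XtX + lam * penalty, Xty)

rng = np.random.default_rng(0)
p, lam = 5, 0.5  # assumed number of basis coefficients and smoothing parameter

# Second-order difference penalty D'D, as commonly used with P-splines.
D = np.diff(np.eye(p), n=2, axis=0)
penalty = D.T @ D

# Simulated data held by three nodes of different sizes (illustrative only).
beta_true = rng.normal(size=p)
nodes = []
for n in (40, 60, 80):
    X = rng.normal(size=(n, p))
    y = X @ beta_true + 0.1 * rng.normal(size=n)
    nodes.append((X, y))

# Distributed fit: only p x p and p x 1 summaries leave each node.
beta_distributed = aggregate_and_solve(
    [node_summary(X, y) for X, y in nodes], penalty, lam
)

# Centralized fit on the pooled data, for comparison.
X_all = np.vstack([X for X, _ in nodes])
y_all = np.concatenate([y for _, y in nodes])
beta_central = np.linalg.solve(X_all.T @ X_all + lam * penalty, X_all.T @ y_all)

max_abs_diff = float(np.max(np.abs(beta_distributed - beta_central)))
```

In this toy setting the two coefficient vectors agree up to floating-point error, because the pooled cross-products are exactly the sums of the per-node cross-products. The communication cost per node depends on `p` rather than on the node's sample size.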
Similar Papers
Efficient Bayesian spatially varying coefficients modeling for censored data using the Vecchia approximation
Methodology
Maps pollution better, even with missing data.
A Scalable Variational Bayes Approach for Fitting Non-Conjugate Spatial Generalized Linear Mixed Models via Basis Expansions
Methodology
Lets computers quickly learn from big, messy data.
Scalable Bayesian inference for high-dimensional mixed-type multivariate spatial data
Methodology
Models different kinds of data together in places.