Split-and-Conquer: Distributed Factor Modeling for High-Dimensional Matrix-Variate Time Series
By: Hangjin Jiang, Yuzhou Li, Zhaoxing Gao
Potential Business Impact:
Shrinks big, messy data for faster, smarter analysis.
In this paper, we propose a distributed framework for reducing the dimensionality of high-dimensional, large-scale, heterogeneous matrix-variate time series data using a factor model. The data are first partitioned column-wise (or row-wise) and allocated to node servers, where each node estimates the row (or column) loading matrix via two-dimensional tensor PCA. These local estimates are then transmitted to a central server and aggregated, followed by a final PCA step to obtain the global row (or column) loading matrix estimator. Given the estimated loading matrices, the corresponding factor matrices are subsequently computed. Unlike existing distributed approaches, our framework preserves the latent matrix structure, thereby improving computational efficiency and enhancing information utilization. We also discuss row- and column-wise clustering procedures for settings in which the group memberships are unknown. Furthermore, we extend the analysis to unit-root nonstationary matrix-variate time series. Asymptotic properties of the proposed method are derived as the dimension of the data in each computing unit and the sample size $T$ diverge. Simulation studies assess the computational efficiency and estimation accuracy of the proposed framework, and real data applications further validate its predictive performance.
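The split, local-PCA, aggregate, final-PCA pipeline described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' estimator: it assumes the matrix factor model $X_t = R F_t C^\top + E_t$, uses the eigenvectors of the local sample matrix $\frac{1}{T}\sum_t X_t^{(j)} X_t^{(j)\top}$ as the local step, and aggregates by averaging local projection matrices, which is one common choice in distributed PCA; the paper's exact aggregation rule may differ. All function names and the simulated data are hypothetical.

```python
import numpy as np

def local_row_loading(X_block, k1):
    """Node step (assumed form): top-k1 eigenvectors of (1/T) sum_t X_t X_t^T.

    X_block has shape (T, p1, m), where m is the number of columns on this node.
    """
    T = X_block.shape[0]
    M = np.einsum("tij,tkj->ik", X_block, X_block) / T  # p1 x p1, symmetric
    _, vecs = np.linalg.eigh(M)                         # eigenvalues ascending
    return vecs[:, ::-1][:, :k1]                        # keep the leading k1

def aggregate_loadings(local_bases, k1):
    """Central step (assumed form): PCA on the averaged projection matrices."""
    P = sum(Q @ Q.T for Q in local_bases) / len(local_bases)
    _, vecs = np.linalg.eigh(P)
    return vecs[:, ::-1][:, :k1]

def distributed_row_loading(X, k1, n_nodes):
    """Split columns across nodes, estimate locally, then aggregate."""
    blocks = np.array_split(X, n_nodes, axis=2)
    return aggregate_loadings([local_row_loading(b, k1) for b in blocks], k1)

def estimate_factors(X, R_hat, C_hat):
    """With orthonormal loadings, project each X_t: F_t = R_hat^T X_t C_hat."""
    return np.einsum("ia,tij,jb->tab", R_hat, X, C_hat)

# Simulated example (hypothetical sizes): X_t = R F_t C^T + small noise.
rng = np.random.default_rng(0)
T, p1, p2, k1, k2 = 200, 20, 40, 2, 2
R = rng.standard_normal((p1, k1))
C = rng.standard_normal((p2, k2))
F = rng.standard_normal((T, k1, k2))
X = np.einsum("ia,tab,jb->tij", R, F, C) + 0.1 * rng.standard_normal((T, p1, p2))

R_hat = distributed_row_loading(X, k1, n_nodes=4)
# Column loadings: apply the same routine to the transposed series,
# which amounts to a row-wise partition of the original data.
C_hat = distributed_row_loading(X.transpose(0, 2, 1), k2, n_nodes=4)
F_hat = estimate_factors(X, R_hat, C_hat)

# Loadings are identified only up to rotation, so compare column spaces.
Q, _ = np.linalg.qr(R)
err = np.linalg.norm(R_hat @ R_hat.T - Q @ Q.T)
```

Because loading matrices are identified only up to an invertible transformation, the sketch measures accuracy through the distance between projection matrices rather than by comparing `R_hat` with `R` entry by entry. Each node only sends a `p1 x k1` basis to the central server, which is what keeps the communication cost low in this kind of scheme.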
Similar Papers
Factor Modelling for Biclustering Large-dimensional Matrix-valued Time Series
Methodology
Finds hidden patterns in complex data sets.
Sparse-Group Factor Analysis for High-Dimensional Time Series
Methodology
Makes complex data easier to understand.
Inference in matrix-valued time series with common stochastic trends and multifactor error structure
Methodology
Finds hidden patterns in complex data streams.