STM3: Mixture of Multiscale Mamba for Long-Term Spatio-Temporal Time-Series Prediction
By: Haolong Chen, Liang Zhang, Zhengyuan Xin and more
Potential Business Impact:
Predicts future events by seeing patterns in time.
Spatio-temporal time-series prediction has developed rapidly, yet existing deep learning methods struggle to learn complex long-term spatio-temporal dependencies efficiently. Learning these long-term dependencies brings two challenges: 1) long temporal sequences naturally contain multiscale information that is hard to extract efficiently; 2) the multiscale temporal information from different nodes is highly correlated and hard to model. To address these challenges, we propose an efficient Spatio-Temporal Multiscale Mamba (STM2) that includes a multiscale Mamba architecture to capture the multiscale information efficiently and simultaneously, and an adaptive graph causal convolution network to learn the complex multiscale spatio-temporal dependencies. STM2 aggregates information from the different scales hierarchically, which guarantees that the scales remain distinguishable. To capture diverse temporal dynamics across all spatial nodes more efficiently, we further propose an enhanced version, Spatio-Temporal Mixture of Multiscale Mamba (STM3), which employs a special Mixture-of-Experts architecture with a more stable routing strategy and a causal contrastive learning strategy to enhance scale distinguishability. We prove that STM3 has much better routing smoothness and guarantees pattern disentanglement for each expert. Extensive experiments on real-world benchmarks demonstrate the superior performance of STM2 and STM3, which achieve state-of-the-art results in long-term spatio-temporal time-series prediction.
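To make the idea of multiscale information and expert mixing concrete, the following is a minimal toy sketch, not the authors' method. It assumes that a "scale" can be approximated by average pooling over non-overlapping windows and that each "expert" is a trivial last-value predictor combined by softmax weights. The function names (`avg_pool`, `multiscale_views`, `mixture_forecast`), the window sizes, and the gate scores are all hypothetical and chosen only for illustration. The Mamba blocks, the adaptive graph causal convolution, the routing strategy, and the contrastive objective described in the abstract are not reproduced here.

```python
import math


def avg_pool(series, window):
    """Downsample by averaging non-overlapping windows (a trailing partial window is dropped)."""
    n = len(series) // window
    return [sum(series[i * window:(i + 1) * window]) / window for i in range(n)]


def multiscale_views(series, windows=(1, 2, 4)):
    """Return one coarsened copy of the series per window size (assumed notion of 'scale')."""
    return {w: avg_pool(series, w) for w in windows}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_forecast(series, gate_scores, windows=(1, 2, 4)):
    """Toy mixture: each 'expert' predicts the last value of its scale's view,
    and the output is the softmax-weighted sum of those predictions."""
    views = multiscale_views(series, windows)
    expert_preds = [views[w][-1] for w in windows]
    weights = softmax(gate_scores)
    return sum(w * p for w, p in zip(weights, expert_preds))


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
views = multiscale_views(series)
forecast = mixture_forecast(series, gate_scores=[0.0, 0.0, 0.0])
```

With equal gate scores every scale contributes equally. In a learned model the gate scores would come from a trained router rather than being fixed by hand.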
Similar Papers
ms-Mamba: Multi-scale Mamba for Time-Series Forecasting
Machine Learning (CS)
Predicts future events better by looking at different time speeds.
HiSTM: Hierarchical Spatiotemporal Mamba for Cellular Traffic Forecasting
Networking and Internet Architecture
Predicts phone network traffic better, faster.
SAMBA: Toward a Long-Context EEG Foundation Model via Spatial Embedding and Differential Mamba
Machine Learning (CS)
Helps computers understand brain signals better.