Unleashing Temporal Capacity of Spiking Neural Networks through Spatiotemporal Separation
By: Yiting Dong, Zhaofei Yu, Jianhao Ding, and more
Spiking Neural Networks (SNNs) are considered naturally suited to temporal processing, with membrane potential propagation widely regarded as their core temporal modeling mechanism. However, existing research lacks analysis of its actual contribution in complex temporal tasks. We design Non-Stateful (NS) models that progressively remove membrane propagation to quantify its stage-wise role. Experiments reveal a counterintuitive phenomenon: moderate removal in shallow or deep layers improves performance, while excessive removal causes collapse. We attribute this to spatio-temporal resource competition: neurons must encode both semantics and dynamics within a limited representational range, so maintaining temporal state consumes capacity needed for spatial learning. Based on this, we propose the Spatial-Temporal Separable Network (STSep), which decouples residual blocks into independent spatial and temporal branches. The spatial branch focuses on semantic extraction, while the temporal branch captures motion through explicit temporal differences. Experiments on Something-Something V2, UCF101, and HMDB51 show that STSep achieves superior performance, with retrieval tasks and attention analysis confirming a focus on motion rather than static appearance. This work provides a new perspective on SNNs' temporal mechanisms and an effective solution for spatiotemporal modeling in video understanding.
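The core idea of the separable block can be illustrated with a minimal NumPy sketch. This is a hypothetical toy, not the paper's implementation: `spatial_branch` stands in for a per-frame feature extractor with no temporal state, `temporal_branch` computes explicit differences between adjacent timesteps, and the additive fusion is an assumption made here for simplicity.

```python
import numpy as np

def spatial_branch(x):
    # Per-frame semantic extraction (placeholder for a conv block):
    # each timestep is processed independently, with no temporal state.
    return np.maximum(x, 0.0)  # simple ReLU-like nonlinearity

def temporal_branch(x):
    # Explicit temporal differences capture motion between adjacent frames.
    # Prepending the first frame pads the output back to shape (T, ...),
    # so a static sequence yields zero temporal response.
    return np.diff(x, axis=0, prepend=x[:1])

def stsep_block(x):
    # Decoupled block: spatial semantics plus temporal dynamics.
    # Additive fusion is an assumption for this sketch.
    return spatial_branch(x) + temporal_branch(x)

# Toy input: T=4 timesteps, 3 features; frames 0-1 are static,
# frame 2 introduces motion.
x = np.array([[1., 2., 3.],
              [1., 2., 3.],
              [2., 3., 4.],
              [2., 3., 4.]])
out = stsep_block(x)
```

With this toy input, static timesteps contribute only spatial features (the temporal branch outputs zeros), while the step where the frame changes adds a motion term on top of the spatial response, mirroring the claim that the temporal branch responds to motion rather than static appearance.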
Similar Papers
Learning Scalable Temporal Representations in Spiking Neural Networks Without Labels
Emerging Technologies
Teaches computers to learn from pictures without labels.
All in one timestep: Enhancing Sparsity and Energy efficiency in Multi-level Spiking Neural Networks
Neural and Evolutionary Computing
Makes computer brains use less power for thinking.
Hybrid Temporal-8-Bit Spike Coding for Spiking Neural Network Surrogate Training
Neural and Evolutionary Computing
Makes AI see better with less power.