Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition
By: Siyuan Yang, Shilin Lu, Shizheng Wang and more
Potential Business Impact:
Helps computers see actions from tiny motion changes.
This paper explores the promising interplay between spiking neural networks (SNNs) and event-based cameras for privacy-preserving human action recognition (HAR). Event cameras capture only the outlines of motion, and SNNs process spatiotemporal data through spikes, which makes the two a natural fit for event-based HAR. Previous studies, however, have been held back by SNNs' limited ability to process long-term temporal information, which is essential for precise HAR. In this paper, we introduce two novel frameworks to address this limitation: the temporal segment-based SNN (TS-SNN) and the 3D convolutional SNN (3D-SNN). The TS-SNN extracts long-term temporal information by dividing actions into shorter segments, while the 3D-SNN replaces 2D spatial elements with 3D components to facilitate the transmission of temporal information. To promote further research in event-based HAR, we create a dataset, FallingDetection-CeleX, collected using the high-resolution CeleX-V event camera (1280 × 800) and comprising 7 distinct actions. Extensive experimental results show that our proposed frameworks surpass state-of-the-art SNN methods on our newly collected dataset and three other neuromorphic datasets, showcasing their effectiveness in handling long-range temporal information for event-based HAR.
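The idea of dividing an action into shorter temporal segments can be illustrated with a minimal sketch. This is not the authors' TS-SNN implementation: the event layout `(timestamp, x, y, polarity)`, the helper names `split_into_segments` and `events_to_frame`, and the equal-duration split are all assumptions made for illustration. The sketch only shows how an event stream might be cut into segments and each segment accumulated into a polarity-count frame that a downstream network could consume.

```python
from typing import List, Tuple

# Hypothetical event layout: (timestamp, x, y, polarity). Not taken from the paper.
Event = Tuple[float, int, int, int]


def split_into_segments(events: List[Event], num_segments: int) -> List[List[Event]]:
    """Split a time-sorted event stream into equal-duration temporal segments."""
    if num_segments < 1:
        raise ValueError("num_segments must be >= 1")
    if not events:
        return [[] for _ in range(num_segments)]
    t_start, t_end = events[0][0], events[-1][0]
    duration = max(t_end - t_start, 1e-9)  # guard against a zero-length stream
    segments: List[List[Event]] = [[] for _ in range(num_segments)]
    for ev in events:
        idx = int((ev[0] - t_start) / duration * num_segments)
        # The last timestamp maps to num_segments, so clamp it into the final segment.
        segments[min(idx, num_segments - 1)].append(ev)
    return segments


def events_to_frame(events: List[Event], width: int, height: int) -> List[List[List[int]]]:
    """Accumulate events into a 2-channel count frame (channel 0: off, channel 1: on)."""
    frame = [[[0] * width for _ in range(height)] for _ in range(2)]
    for _, x, y, polarity in events:
        frame[1 if polarity > 0 else 0][y][x] += 1
    return frame


# Toy stream of four events on a 4 x 3 sensor (assumed values).
events = [(0.0, 1, 1, 1), (0.4, 2, 1, 0), (1.1, 0, 0, 1), (2.0, 3, 2, 1)]
segments = split_into_segments(events, num_segments=2)
frames = [events_to_frame(seg, width=4, height=3) for seg in segments]
```

Here each segment yields one frame indexed as `frame[channel][y][x]`. A real pipeline would likely bin each segment into several time steps and use a tensor library; the equal-duration split is just one possible choice of segmentation.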
Similar Papers
Spatiotemporal Radar Gesture Recognition with Hybrid Spiking Neural Networks: Balancing Accuracy and Efficiency
Neural and Evolutionary Computing
Saves energy for radar that sees people.
A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential
CV and Pattern Recognition
Recognizes actions without showing faces.
Hybrid Spiking Vision Transformer for Object Detection with Event Cameras
CV and Pattern Recognition
Helps cameras see moving things with less power.