SDTrack: A Baseline for Event-based Tracking via Spiking Neural Networks
By: Yimeng Shan, Zhenbang Ren, Haodi Wu, and more
Potential Business Impact:
Tracks moving things faster while using less power.
Event cameras provide superior temporal resolution, dynamic range, power efficiency, and pixel bandwidth. Spiking Neural Networks (SNNs) naturally complement event data through discrete spike signals, making them ideal for event-based tracking. However, current approaches that combine Artificial Neural Networks (ANNs) and SNNs, along with suboptimal architectures, compromise energy efficiency and limit tracking performance. To address these limitations, we propose the first Transformer-based spike-driven tracking pipeline. Our Global Trajectory Prompt (GTP) method effectively captures global trajectory information and aggregates it with event streams into event images to enhance spatiotemporal representation. We then introduce SDTrack, a Transformer-based spike-driven tracker comprising a Spiking MetaFormer backbone and a tracking head that directly predicts normalized coordinates using spike signals. The framework is end-to-end and requires neither data augmentation nor post-processing. Extensive experiments demonstrate that SDTrack achieves state-of-the-art performance while maintaining the lowest parameter count and energy consumption across multiple event-based tracking benchmarks, establishing a solid baseline for future research in the field of neuromorphic vision.
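To make the abstract's two core ingredients concrete, here is a minimal sketch of (a) aggregating an event stream into an event image and (b) a single leaky integrate-and-fire (LIF) step, the spike-driven unit that SNNs are built from. This is an illustration of the general techniques only, not the paper's actual GTP method or SDTrack code; all function and parameter names (`events_to_image`, `lif_step`, `tau`, `v_th`) are assumptions made for this sketch.

```python
import numpy as np

def events_to_image(events, height, width):
    """Accumulate (x, y, polarity) events into a (2, H, W) event image:
    channel 0 counts positive-polarity events, channel 1 negative ones.
    A simplified stand-in for the aggregation the abstract describes."""
    img = np.zeros((2, height, width), dtype=np.float32)
    for x, y, p in events:
        img[0 if p > 0 else 1, y, x] += 1.0
    return img

def lif_step(v, x, tau=2.0, v_th=1.0):
    """One leaky integrate-and-fire update (illustrative constants):
    the membrane potential v leakily integrates input x, emits a binary
    spike when it crosses the threshold v_th, then hard-resets."""
    v = v + (x - v) / tau                 # leaky integration toward input
    spike = (v >= v_th).astype(np.float32)  # binary spike signal
    v = v * (1.0 - spike)                 # reset potential where spiked
    return spike, v
```

A spike-driven backbone chains many such LIF layers, so downstream computation operates on sparse binary spikes rather than dense activations, which is where the energy savings the abstract claims come from.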
Similar Papers
SMTrack: End-to-End Trained Spiking Neural Networks for Multi-Object Tracking in RGB Videos
CV and Pattern Recognition
Tracks many moving things better with less power.
Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition
CV and Pattern Recognition
Helps computers see actions from tiny motion changes.
Hybrid Spiking Vision Transformer for Object Detection with Event Cameras
CV and Pattern Recognition
Helps cameras see moving things with less power.