
ASTER: Attention-based Spiking Transformer Engine for Event-driven Reasoning

Published: November 10, 2025 | arXiv ID: 2511.06770v1

By: Tamoghno Das, Khanh Phan Vu, Hanning Chen and more

Potential Business Impact:

Makes smart cameras use less power to see.

Business Areas:

Intelligent Systems Artificial Intelligence, Data and Analytics, Science and Engineering

The integration of spiking neural networks (SNNs) with transformer-based architectures has opened new opportunities for bio-inspired, low-power, event-driven visual reasoning on edge devices. However, the high temporal resolution and binary nature of spike-driven computation introduce architectural mismatches with conventional digital hardware (CPU/GPU). Prior neuromorphic and Processing-in-Memory (PIM) accelerators struggle with the high sparsity and complex operations prevalent in such models. To address these challenges, we propose a memory-centric hardware accelerator tailored for spiking transformers and optimized for deployment in real-time event-driven frameworks, such as classification with both static and event-based input frames. Our design combines a hybrid analog-digital PIM architecture with input sparsity optimizations and a custom-designed dataflow that minimizes memory access overhead and maximizes data reuse under spatiotemporal sparsity, enabling compute- and memory-efficient end-to-end execution of spiking transformers. We further propose inference-time software optimizations for layer skipping and timestep reduction, leveraging Bayesian Optimization with surrogate modeling to perform robust, efficient co-exploration of the joint algorithmic-microarchitectural design space under tight computational budgets. Evaluated on both image (ImageNet) and event-based (CIFAR-10 DVS, DVSGesture) classification, the accelerator achieves up to ~467x and ~1.86x energy reduction compared to an edge GPU (Jetson Orin Nano) and previous PIM accelerators for spiking transformers, respectively, while maintaining competitive task accuracy on the ImageNet dataset. This work enables a new class of intelligent, ubiquitous edge AI built on spiking transformer acceleration for low-power, real-time visual processing at the extreme edge.
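To make the role of binary, sparse spikes more concrete, the following is a minimal Python sketch of the general idea behind event-driven computation: when inputs are 0/1 spikes, a matrix-vector product reduces to accumulating the weight columns of the inputs that fired, so work scales with the number of spikes rather than the input size. This is an illustration of the generic principle only, not the ASTER dataflow or its PIM hardware; the function names, weights, and spike vector are all hypothetical.

```python
# Illustrative sketch only: generic event-driven accumulation, NOT the ASTER design.
# All names and values below are hypothetical.

def dense_matvec(weights, spikes):
    """Reference: ordinary matrix-vector product that visits every input."""
    return [sum(w * s for w, s in zip(row, spikes)) for row in weights]


def event_driven_matvec(weights, spikes):
    """Accumulate weight columns only for inputs that spiked (value 1)."""
    active = [j for j, s in enumerate(spikes) if s]  # indices of spike events
    out = [0.0] * len(weights)
    for j in active:
        for i, row in enumerate(weights):
            out[i] += row[j]  # no multiply needed because the spike value is 1
    return out, len(active)


weights = [[0.5, -1.0, 2.0, 0.25],
           [1.5, 0.0, -0.5, 1.0]]
spikes = [0, 1, 0, 1]  # binary spike vector for a single timestep

result, n_active = event_driven_matvec(weights, spikes)
print(result, "computed from", n_active, "of", len(spikes), "inputs")
```

In this toy case only two of the four input columns are touched, and the result matches the dense product. The sparser the spike vector, the fewer accumulations are needed, which is the property that sparsity-aware hardware aims to exploit.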

NEURAL: An Elastic Neuromorphic Architecture with Hybrid Data-Event Execution and On-the-fly Attention Dataflow

Hardware Architecture

Makes computer brains faster and use less power.

18 Sep 2025

88%

Attention via Synaptic Plasticity is All You Need: A Biologically Inspired Spiking Neuromorphic Transformer

Neural and Evolutionary Computing

Makes AI use much less power, like a brain.

18 Nov 2025

88%

Hardware Efficient Accelerator for Spiking Transformer With Reconfigurable Parallel Time Step Computing

Hardware Architecture

Makes AI brains use less power to think.

25 Mar 2025


Country of Origin

🇺🇸 United States

Page Count

8 pages
