Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers
By: Francesco Daghero, Daniele Jahier Pagliari, Francesco Conti, and more
Potential Business Impact:
Makes small computers run smart programs faster.
The acceleration of pruned Deep Neural Networks (DNNs) on edge devices such as Microcontrollers (MCUs) is a challenging task, given the tight area and power constraints of these devices. In this work, we propose a threefold contribution to address this problem. First, we design a set of optimized software kernels for N:M pruned layers, targeting ultra-low-power, multicore RISC-V MCUs, which are up to 2.1x and 3.4x faster than their dense counterparts at 1:8 and 1:16 sparsity, respectively. Then, we implement a lightweight Instruction-Set Architecture (ISA) extension to accelerate the indirect load and non-zero index decompression operations required by our kernels, obtaining an extra speedup of up to 1.9x at the cost of a 5% area overhead. Lastly, we extend an open-source DNN compiler to use our sparse kernels for complete networks, showing speedups of 3.21x and 1.81x on a ResNet18 and a Vision Transformer (ViT), respectively, with less than a 1.5% accuracy drop compared to a dense baseline.
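The abstract does not show the kernels themselves; the following is a minimal C sketch of what a 1:8 (N=1, M=8) sparse dot product could look like, assuming a hypothetical storage layout in which each group of eight weights keeps a single non-zero value plus a 3-bit position index packed two-per-byte. The function and variable names (sparse_dot_1x8, values, indices, acts) are illustrative, not the paper's API. The index decode and indirect activation load in the inner loop are the two operations the proposed ISA extension is said to accelerate.

```c
#include <stdint.h>

/* Sketch of a 1:8 N:M sparse dot product with 8-bit quantized data.
 * Hypothetical layout (not the paper's exact format): each group of
 * 8 weights stores one non-zero value in `values` and its position
 * within the group as a 3-bit index, two indices packed per byte. */
int32_t sparse_dot_1x8(const int8_t *values,   /* one non-zero per group        */
                       const uint8_t *indices, /* 3-bit offsets, two per byte   */
                       const int8_t *acts,     /* dense activation vector       */
                       int num_groups)         /* number of 8-weight groups     */
{
    int32_t acc = 0;
    for (int g = 0; g < num_groups; g++) {
        /* Decompress the non-zero index: low nibble for even groups,
         * high nibble for odd ones. In the paper, this decode is one
         * of the operations offloaded to the ISA extension. */
        uint8_t idx = (g & 1) ? (uint8_t)((indices[g >> 1] >> 4) & 0x7)
                              : (uint8_t)(indices[g >> 1] & 0x7);
        /* Indirect load of the matching activation, then multiply-accumulate. */
        acc += (int32_t)values[g] * (int32_t)acts[g * 8 + idx];
    }
    return acc;
}
```

Compared to a dense loop over all 8 x num_groups weights, this touches only one weight and one activation per group, which is the source of the reported speedups; the per-group decode and indirect load are the overhead the hardware extension targets.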
Similar Papers
Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs
Machine Learning (CS)
Makes AI run faster by skipping unneeded math.
SpikeStream: Accelerating Spiking Neural Network Inference on RISC-V Clusters with Sparse Computation Extensions
Hardware Architecture
Makes brain-like computers run faster and use less power.
Evaluating the Energy Efficiency of NPU-Accelerated Machine Learning Inference on Embedded Microcontrollers
Emerging Technologies
Makes tiny computers run smart programs faster and cheaper.