Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators
By: Alaa Zniber, Arne Symons, Ouassim Karrakchou and more
Potential Business Impact:
Makes smart devices run faster using less power.
Advancements in high-performance computing and cloud technologies have enabled the development of increasingly sophisticated Deep Learning (DL) models. However, the growing demand for embedded intelligence at the edge imposes stringent computational and energy constraints, which makes these large-scale models difficult to deploy. Early Exiting Neural Networks (EENNs) have emerged as a promising solution: they terminate inference dynamically based on input complexity, so that easy inputs consume less computation. Despite their potential, EENN performance is strongly influenced by the heterogeneity of edge accelerators and by the constraints imposed by quantization, both of which affect accuracy, energy efficiency, and latency. Yet research on automatically optimizing EENN design for edge hardware remains limited. To bridge this gap, we propose a hardware-aware Neural Architecture Search (NAS) framework that systematically integrates the effects of quantization and hardware resource allocation to optimize the placement of early exit points within a network backbone. Experimental results on the CIFAR-10 dataset demonstrate that our NAS framework can discover architectures that reduce computational cost by more than 50% compared to conventional static networks, making them more suitable for deployment in resource-constrained edge environments.
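To make the early-exit idea in the abstract concrete, the following is a minimal sketch of confidence-thresholded inference in plain Python. It is not the authors' method: the abstract does not state the exit criterion, so the softmax-confidence threshold, the function name `early_exit_inference`, the toy stages and heads, and the per-stage cost numbers are all assumptions chosen for illustration. Quantization and hardware resource allocation, which the paper's NAS framework accounts for, are not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_inference(x, stages, exit_heads, stage_costs, threshold=0.9):
    """Run backbone stages in order and stop at the first confident exit.

    Assumed exit rule (hypothetical, not taken from the paper): exit when
    the top-1 softmax probability reaches `threshold`; the last exit
    always fires. The cost of the exit heads themselves is ignored.

    Returns (predicted_class, exit_index, cost_spent).
    """
    features = x
    cost = 0.0
    last = len(stages) - 1
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        features = stage(features)
        cost += stage_costs[i]
        probs = softmax(head(features))
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return probs.index(confidence), i, cost


# Toy backbone (illustrative only): each stage doubles the features,
# and each exit head passes them through unchanged as logits.
stages = [lambda f: [2.0 * v for v in f]] * 3
exit_heads = [lambda f: list(f)] * 3
stage_costs = [1.0, 2.0, 3.0]  # made-up relative compute units

easy = early_exit_inference([2.0, 0.0], stages, exit_heads, stage_costs)
hard = early_exit_inference([0.2, 0.0], stages, exit_heads, stage_costs)
print(easy)  # leaves at the first exit after spending 1.0 cost units
print(hard)  # runs all three stages and spends 6.0 cost units
```

In this toy setup the well-separated input leaves at the first exit, while the ambiguous input traverses the whole backbone. Average compute therefore depends on where the exits sit and how often each one fires, which is the kind of placement decision the abstract describes the NAS framework as optimizing.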
Similar Papers
ONNX-Net: Towards Universal Representations and Instant Performance Prediction for Neural Architectures
Machine Learning (CS)
Tests computer brains instantly, no matter their design.
Flexible Vector Integration in Embedded RISC-V SoCs for End to End CNN Inference Acceleration
Distributed, Parallel, and Cluster Computing
Makes smart devices run AI faster and use less power.
Spiking Neural Network Architecture Search: A Survey
Neural and Evolutionary Computing
Makes smart chips use less power for thinking.