High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Published: September 6, 2025 | arXiv ID: 2509.05688v1

By: Kuan-Ting Lin, Ching-Te Chiu, Jheng-Yi Chang, and others

Potential Business Impact:

Makes smart cameras work faster and use less power.

Business Areas:
Image Recognition, Data and Analytics, Software

Deep Convolutional Neural Networks (DCNNs) are widely used in computer vision tasks. However, even inference involves too much computation and data access for edge devices, and the inference latency of state-of-the-art models is impractical for real-world applications. In this paper, we propose a high-utilization, energy-aware, real-time inference deep convolutional neural network accelerator that improves on current accelerators. First, we use the 1x1 convolution kernel as the smallest unit of the computing unit and design a suitable computing unit based on the requirements of each model. Second, we use a Reuse Feature SRAM to store the output of the current layer on-chip and use those values as the input of the next layer. Moreover, we introduce an Output Reuse Strategy and a Ring Stream Dataflow to reduce the amount of data exchanged between the chip and DRAM. Finally, we present an On-fly Pooling Module so that pooling layers are computed entirely on-chip. With the proposed methods, the implemented accelerator chip achieves extremely high hardware utilization and substantially reduces data transfer for the target model, ECNN: compared to a design without the reuse strategy, data access is reduced by 533x. At the same time, the chip has enough computing power for real-time execution of existing image classification models such as VGG16 and MobileNet. Compared with the VWA design, we achieve a 7.52x speedup and 1.92x higher energy efficiency.
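The core of the reuse idea above is that intermediate activations stay in on-chip SRAM instead of making a round trip through DRAM. The following back-of-envelope model sketches that effect; the layer activation sizes are hypothetical placeholders, not figures from the paper, and real savings depend on SRAM capacity and tiling.

```python
def dram_traffic(feature_map_bytes, reuse):
    """Estimate total DRAM transfer for a chain of layers.

    feature_map_bytes: activation sizes in bytes, starting with the
    network input and ending with the final output.
    reuse: if True, intermediate activations are kept in on-chip SRAM
    (as with a Reuse Feature SRAM), so only the first input is read
    from DRAM and only the last output is written back.
    """
    # The network input and final output always touch DRAM.
    total = feature_map_bytes[0] + feature_map_bytes[-1]
    if not reuse:
        # Without reuse, each intermediate activation is written to
        # DRAM by layer i and read back by layer i+1.
        for size in feature_map_bytes[1:-1]:
            total += 2 * size
    return total

# Hypothetical activation sizes (bytes) for a 4-layer chain.
maps = [150_528, 802_816, 401_408, 200_704, 4_096]

no_reuse = dram_traffic(maps, reuse=False)
with_reuse = dram_traffic(maps, reuse=True)
print(no_reuse, with_reuse, round(no_reuse / with_reuse, 2))
```

The ratio printed at the end is the kind of data-access reduction the abstract reports (533x for ECNN with the full reuse strategy); this toy chain only shows the mechanism, not that number.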

Country of Origin
🇹🇼 Taiwan, Province of China

Page Count
13 pages

Category
Computer Science:
Hardware Architecture