Computing-In-Memory Dataflow for Minimal Buffer Traffic
By: Choongseok Song, Doo Seok Jeong
Potential Business Impact:
Makes AI chips run faster and use less power.
Computing-In-Memory (CIM) offers a potential solution to the memory-wall problem and can achieve high energy efficiency by minimizing data movement, making it a promising architecture for edge AI devices. Lightweight models such as MobileNet and EfficientNet, which rely on depthwise convolution for feature extraction, have been developed for these devices. However, CIM macros often struggle to accelerate depthwise convolution, suffering from underutilized CIM memory and heavy buffer traffic. The latter, in particular, has been overlooked despite its significant impact on latency and energy consumption. To address this, we introduce a novel CIM dataflow that significantly reduces buffer traffic by maximizing data reuse and improving memory utilization during depthwise convolution. The proposed dataflow is grounded in solid theoretical principles, which we fully demonstrate in this paper. When applied to MobileNet and EfficientNet models, our dataflow reduces buffer traffic by 77.4-87.0%, cutting total data-traffic energy and latency by 10.1-17.9% and 15.6-27.8%, respectively, compared to the baseline (a conventional weight-stationary dataflow).
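To make the buffer-traffic argument concrete, below is a minimal NumPy sketch of depthwise convolution together with an illustrative input-buffer read count. This is not the paper's dataflow: the function names, the no-padding/stride-1 setup, and the "full reuse" bound are simplifying assumptions added here to show why depthwise convolution invites data reuse (adjacent output windows overlap in (K-1)/K of their rows, so a naive scheme refetches most inputs).

```python
import numpy as np

def depthwise_conv2d(x, w, stride=1):
    """Depthwise convolution: each input channel is filtered by its own
    kernel, so no weights are shared across channels.
    x: (C, H, W) input, w: (C, K, K) kernels -> (C, H_out, W_out)."""
    C, H, W = x.shape
    _, K, _ = w.shape
    H_out = (H - K) // stride + 1
    W_out = (W - K) // stride + 1
    y = np.zeros((C, H_out, W_out))
    for c in range(C):                  # one kernel per channel
        for i in range(H_out):
            for j in range(W_out):
                patch = x[c, i*stride:i*stride+K, j*stride:j*stride+K]
                y[c, i, j] = np.sum(patch * w[c])
    return y

def buffer_reads(H, W, K, stride=1, reuse=False):
    """Illustrative (toy) count of input-buffer reads per channel.
    Without reuse every KxK window is refetched from the buffer;
    with full reuse each input element is read exactly once."""
    H_out = (H - K) // stride + 1
    W_out = (W - K) // stride + 1
    return H * W if reuse else H_out * W_out * K * K

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 14, 14))       # 32 channels, 14x14 map
    w = rng.standard_normal((32, 3, 3))         # one 3x3 kernel per channel
    print(depthwise_conv2d(x, w).shape)         # (32, 12, 12)
    print(buffer_reads(14, 14, 3))              # 1296 reads/channel, naive
    print(buffer_reads(14, 14, 3, reuse=True))  # 196 reads/channel, full reuse
```

For a 14x14 feature map with a 3x3 kernel, this toy bound gives 1296 reads per channel without reuse versus 196 with full reuse, roughly an 85% reduction. It is purely illustrative, but it shows the direction of the 77.4-87.0% buffer-traffic savings the paper reports.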
Similar Papers
Computing-In-Memory Aware Model Adaption For Edge Devices
Hardware Architecture
Makes AI chips faster and smaller.
A Time- and Energy-Efficient CNN with Dense Connections on Memristor-Based Chips
Hardware Architecture
Makes AI chips run faster and use less power.
RISC-V Based TinyML Accelerator for Depthwise Separable Convolutions in Edge AI
Hardware Architecture
Makes smart devices run faster and use less power.