A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations
By: Anastasios Petropoulos, Theodore Antonakopoulos
Potential Business Impact:
Makes AI inference run faster and use less power.
Deep neural network (DNN) inference relies increasingly on specialized hardware for high computational efficiency. This work introduces a field-programmable gate array (FPGA)-based dynamically configurable accelerator featuring systolic arrays, high-bandwidth memory, and UltraRAMs. We present two processing unit (PU) configurations with different computing capabilities using the same interfaces and peripheral blocks. By instantiating multiple PUs and employing a heuristic weight transfer schedule, the architecture achieves notable throughput-efficiency gains over prior works. Moreover, we outline how the architecture can be extended to emulate analog in-memory computing (AIMC) devices to aid next-generation heterogeneous AIMC chip designs and investigate device-level noise behavior. Overall, this brief presents a versatile DNN inference acceleration architecture adaptable to various models and future FPGA designs.
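To give a rough sense of the style of computation such an accelerator targets, the Python sketch below tiles a GEMM so that weight tiles are staged one at a time (loosely analogous to transferring weights into on-chip UltraRAM buffers) while partial sums accumulate, and can optionally inject Gaussian noise to mimic AIMC device non-idealities. The tile sizes, the noise model, and all names are illustrative assumptions, not the paper's actual design or schedule.

```python
import numpy as np


def tiled_gemm_with_optional_noise(x, w, tile_k=64, tile_n=64,
                                   noise_std=0.0, rng=None):
    """Illustrative tiled GEMM computing y = x @ w.

    Weight tiles are processed one at a time, mimicking an accelerator
    that stages tiles of W into on-chip memory while accumulating
    partial sums. If noise_std > 0, Gaussian noise is added to each
    tile's contribution to loosely emulate AIMC device noise.
    All parameters here are assumptions for illustration only.
    """
    rng = rng or np.random.default_rng()
    m, k = x.shape
    k2, n = w.shape
    assert k == k2, "inner dimensions must match"
    y = np.zeros((m, n), dtype=np.float32)
    for n0 in range(0, n, tile_n):                       # output-column tiles
        for k0 in range(0, k, tile_k):                   # reduction-dimension tiles
            w_tile = w[k0:k0 + tile_k, n0:n0 + tile_n]   # "transfer" one weight tile
            partial = x[:, k0:k0 + tile_k] @ w_tile      # systolic-style multiply-accumulate
            if noise_std > 0.0:
                partial = partial + rng.normal(0.0, noise_std, partial.shape)
            y[:, n0:n0 + tile_n] += partial              # accumulate partial sums
    return y


# Example usage: the noiseless tiled result matches a plain GEMM.
if __name__ == "__main__":
    x = np.random.rand(8, 256).astype(np.float32)
    w = np.random.rand(256, 128).astype(np.float32)
    y_ref = x @ w
    y_tiled = tiled_gemm_with_optional_noise(x, w)
    print("max abs error:", np.max(np.abs(y_ref - y_tiled)))
```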
Similar Papers
Instruction-Based Coordination of Heterogeneous Processing Units for Acceleration of DNN Inference
Hardware Architecture
Speeds up AI by making computer chips work together.
FeNN-DMA: A RISC-V SoC for SNN acceleration
Neural and Evolutionary Computing
Makes smart computer brains work faster and use less power.
HePGA: A Heterogeneous Processing-in-Memory based GNN Training Accelerator
Emerging Technologies
Makes computers learn faster and use less power.