A Latency-Constrained, Gated Recurrent Unit (GRU) Implementation in the Versal AI Engine
By: M. Sapkas, A. Triossi, M. Zanetti
Potential Business Impact:
Lowers the inference latency of GRU neural networks for real-time tasks.
This work explores the use of the AMD Xilinx Versal Adaptable Intelligent Engine (AIE) to accelerate Gated Recurrent Unit (GRU) inference for latency-constrained applications. We present a custom framework for distributing the workload across the AIE's vector processors and propose a hybrid design spanning the AIE and the Programmable Logic (PL) to optimize computational efficiency. Our approach highlights the potential of deploying adaptable neural networks in real-time environments, such as online preprocessing in the readout chain of a physics experiment, offering a flexible alternative to traditional fixed-function algorithms.
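To make the idea of distributing GRU inference across several vector processors concrete, the following is a minimal sketch in plain Python. It is not the authors' framework: the abstract does not say how the workload is split, so the row-wise partitioning shown here, the function names (`split_rows`, `partitioned_matvec`, `gru_step`), the parameter dictionary keys, and the `n_workers` argument are all illustrative assumptions. The GRU update follows one common formulation (reset gate applied to the hidden state before the recurrent product); other libraries use slightly different conventions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def split_rows(n_rows, n_workers):
    """Contiguous, near-equal row ranges, one per hypothetical worker."""
    base, extra = divmod(n_rows, n_workers)
    ranges, start = [], 0
    for k in range(n_workers):
        size = base + (1 if k < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def partitioned_matvec(W, v, n_workers):
    """Matrix-vector product where each worker computes only its own rows.

    The workers run sequentially here; on real hardware each row block
    could be assigned to a separate processor (an assumption, not a claim
    about the paper's mapping).
    """
    out = []
    for rows in split_rows(len(W), n_workers):
        out.extend(sum(w * x for w, x in zip(W[r], v)) for r in rows)
    return out


def gru_step(x, h, p, n_workers=1):
    """One GRU time step; p holds hypothetical weight and bias entries."""
    mv = lambda W, v: partitioned_matvec(W, v, n_workers)
    # Update gate z and reset gate r.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(mv(p["Wz"], x), mv(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c)
         for a, b, c in zip(mv(p["Wr"], x), mv(p["Ur"], h), p["br"])]
    # Candidate state uses the reset-gated hidden state.
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b + c)
         for a, b, c in zip(mv(p["Wn"], x), mv(p["Un"], rh), p["bn"])]
    # Blend the previous state and the candidate state.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]
```

Because every output row of a matrix-vector product depends only on its own weight row, the partitioned result is identical to the unpartitioned one; the split changes only where the work is done. A design targeting real hardware would additionally have to account for data movement and synchronization between processors, which this sketch does not model.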
Similar Papers
AIE4ML: An End-to-End Framework for Compiling Neural Networks for the Next Generation of AMD AI Engines
Machine Learning (CS)
Makes AI run super fast on special chips.
Integrated GARCH-GRU in Financial Volatility Forecasting
Statistical Finance
Predicts stock market ups and downs better.
A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations
Hardware Architecture
Makes AI learn faster and use less power.