
A Latency-Constrained, Gated Recurrent Unit (GRU) Implementation in the Versal AI Engine

Published: November 19, 2025 | arXiv ID: 2511.15626v1

By: M. Sapkas, A. Triossi, M. Zanetti

Potential Business Impact:

Speeds up smart computer brains for fast tasks.

Business Areas:

Intelligent Systems Artificial Intelligence, Data and Analytics, Science and Engineering

This work explores the use of the AMD Xilinx Versal Adaptable Intelligent Engine (AIE) to accelerate Gated Recurrent Unit (GRU) inference for latency-constrained applications. We present a custom framework for distributing the workload across the AIE's vector processors and propose a hybrid AIE and Programmable Logic (PL) design to optimize computational efficiency. Our approach highlights the potential of deploying adaptable neural networks in real-time environments, such as online preprocessing in the readout chain of a physics experiment, offering a flexible alternative to traditional fixed-function algorithms.
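To make the idea of distributing a GRU across several vector processors concrete, here is a minimal pure-Python sketch of one GRU time step in which the hidden units are split into contiguous row blocks, one block per hypothetical compute tile. It is not the authors' framework: the abstract does not describe their partitioning scheme, so the names `gru_rows`, `split_rows`, `gru_step_tiled`, and `num_tiles` are illustrative assumptions. The sketch also assumes the GRU variant in which the reset gate is applied after the recurrent matrix-vector product (the convention used by PyTorch), because that makes each hidden unit depend only on its own weight rows.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))


def gru_rows(rows, x, h, W, U, b):
    """Compute the new hidden state for the hidden-unit indices in `rows`.

    W, U, b are dicts keyed by gate name: "z" (update), "r" (reset),
    "n" (candidate). Each unit i needs only row i of every matrix,
    so disjoint row blocks can be evaluated independently.
    """
    out = []
    for i in rows:
        z = sigmoid(dot(W["z"][i], x) + dot(U["z"][i], h) + b["z"][i])
        r = sigmoid(dot(W["r"][i], x) + dot(U["r"][i], h) + b["r"][i])
        # Assumed variant: reset gate scales the recurrent term after the product.
        n = math.tanh(dot(W["n"][i], x) + r * dot(U["n"][i], h) + b["n"][i])
        out.append((1.0 - z) * n + z * h[i])
    return out


def split_rows(hidden_size, num_tiles):
    """Split hidden-unit indices into near-equal contiguous blocks."""
    base, extra = divmod(hidden_size, num_tiles)
    chunks, start = [], 0
    for t in range(num_tiles):
        size = base + (1 if t < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def gru_step_tiled(x, h, W, U, b, num_tiles):
    """One GRU step; each row block stands in for the work of one tile.

    The blocks run sequentially here. On real hardware they would run in
    parallel, and every tile would need a copy of x and the previous h.
    """
    h_new = []
    for rows in split_rows(len(h), num_tiles):
        h_new.extend(gru_rows(rows, x, h, W, U, b))
    return h_new
```

Because every block reads the same input and previous hidden state and writes a disjoint slice of the new state, the only data exchanged between steps in this sketch is the concatenated hidden vector. In a latency-constrained setting, that exchange is typically the part of such a design that needs the most care.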

AIE4ML: An End-to-End Framework for Compiling Neural Networks for the Next Generation of AMD AI Engines

Machine Learning (CS)

Makes AI run super fast on special chips.

17 Dec 2025

86%

Integrated GARCH-GRU in Financial Volatility Forecasting

Statistical Finance

Predicts stock market ups and downs better.

13 Apr 2025

86%

A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations

Hardware Architecture

Makes AI learn faster and use less power.

9 Oct 2025


Country of Origin

🇮🇹 Italy

Page Count

7 pages
