Real Time FPGA Based CNNs for Detection, Classification, and Tracking in Autonomous Systems: State of the Art Designs and Optimizations
By: Safa Mohammed Sali, Mahmoud Meribout, Ashiyana Abdul Majeed
Potential Business Impact:
Makes cameras understand things faster and with less power.
This paper presents a comprehensive review of recent advances in deploying convolutional neural networks (CNNs) for object detection, classification, and tracking on Field Programmable Gate Arrays (FPGAs). With the increasing demand for real-time computer vision applications in domains such as autonomous vehicles, robotics, and surveillance, FPGAs have emerged as a powerful alternative to GPUs and ASICs due to their reconfigurability, low power consumption, and deterministic latency. We critically examine state-of-the-art FPGA implementations of CNN-based vision tasks, covering algorithmic innovations, hardware acceleration techniques, and the integration of optimization strategies like pruning, quantization, and sparsity-aware methods to maximize performance within hardware constraints. This survey also explores the landscape of modern FPGA platforms, including classical LUT-DSP based architectures, System-on-Chip (SoC) FPGAs, and Adaptive Compute Acceleration Platforms (ACAPs), comparing their capabilities in handling deep learning workloads. Furthermore, we review available software development tools such as Vitis AI, FINN, and Intel FPGA AI Suite, which significantly streamline the design and deployment of AI models on FPGAs. The paper uniquely discusses hybrid architectures that combine GPUs and FPGAs for collaborative acceleration of AI inference, addressing challenges related to energy efficiency and throughput. Additionally, we highlight hardware-software co-design practices, dataflow optimizations, and pipelined processing techniques essential for real-time inference on resource-constrained devices. Through this survey, researchers and engineers are equipped with insights to develop next-generation, power-efficient, and high-performance vision systems optimized for FPGA deployment in edge and embedded applications.
Similar Papers
A Resource-Driven Approach for Implementing CNNs on FPGAs Using Adaptive IPs
Hardware Architecture
Makes AI run faster on small chips.
Real-time Object Detection and Associated Hardware Accelerators Targeting Autonomous Vehicles: A Review
Hardware Architecture
Helps self-driving cars see faster and safer.
Real Time FPGA Based Transformers & VLMs for Vision Tasks: SOTA Designs and Optimizations
Hardware Architecture
Makes smart AI run faster on small devices.