Leveraging Simultaneous Usage of Edge GPU Hardware Engines for Video Face Detection and Recognition
By: Asma Baobaid, Mahmoud Meribout
Potential Business Impact:
Makes cameras find and know faces faster.
Video face detection and recognition at the edge in public places is required in several applications, such as security reinforcement and contactless access to authorized venues. This paper aims to maximize the simultaneous usage of the hardware engines available in current edge GPUs by leveraging concurrency and pipelining of the tasks required for face detection and recognition. These tasks include video decoding, which is required in most face monitoring applications because the video streams are usually carried over a Gbps Ethernet network. This constitutes an improvement over previous works, where the tasks are usually allocated to a single engine due to the lack of a unified and automated framework that uses all hardware engines simultaneously. In addition, previous works usually took their input faces from still images or from raw video streams, which overlooks the burst delay caused by the decoding stage. The results on real-life video streams suggest that simultaneously using all the hardware engines available in the recent NVIDIA Orin edge GPU achieves higher throughput and a slight power saving of around 300 mW, or around 5%, while satisfying the real-time performance constraint. The performance improves further when several video streams are processed simultaneously. Further improvement could have been obtained if the TensorRT framework had created fewer shuffle layers for the face recognition task. The paper therefore suggests some hardware improvements to existing edge GPU processors to further enhance their performance.
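The pipelining idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework and it does not touch any NVIDIA engine: plain Python threads stand in for the separate hardware engines, and the functions `decode`, `detect`, and `recognize`, the `run_pipeline` helper, and the queue depth are all hypothetical placeholders chosen for illustration. The point is only that each stage runs concurrently, so frame N can be recognized while frame N+1 is being detected and frame N+2 is being decoded.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream


def stage(worker, inbox, outbox):
    """Apply `worker` to each item from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate end-of-stream downstream
            break
        outbox.put(worker(item))


# Hypothetical stand-ins for the three tasks named in the abstract.
def decode(packet):
    return {"frame": packet}


def detect(frame):
    return {**frame, "faces": [f"face-{frame['frame']}"]}


def recognize(detection):
    return {**detection, "ids": [f"id-{f}" for f in detection["faces"]]}


def run_pipeline(packets, depth=4):
    """Run decode -> detect -> recognize as three concurrent stages."""
    workers = [decode, detect, recognize]
    # Bounded queues keep a fast stage from running far ahead of a slow one.
    queues = [queue.Queue(maxsize=depth) for _ in range(len(workers) + 1)]

    def feed():
        for packet in packets:
            queues[0].put(packet)
        queues[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    for i, worker in enumerate(workers):
        threads.append(
            threading.Thread(target=stage, args=(worker, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for result in run_pipeline(range(3)):
        print(result)
```

Each stage is a single thread reading from a FIFO queue, so frames leave the pipeline in the order they entered. On real hardware the stages would be dispatched to different engines rather than to CPU threads, and the achievable overlap would depend on how the framework maps layers to each engine.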
Similar Papers
Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration
CV and Pattern Recognition
Makes cameras find and know faces faster, cheaper.
Edge GPU Aware Multiple AI Model Pipeline for Accelerated MRI Reconstruction and Analysis
Hardware Architecture
Makes MRI scans and diagnoses much faster.
Boosting performance of computer vision applications through embedded GPUs on the edge
CV and Pattern Recognition
Makes phone apps with cool pictures run faster.