Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs

Published: May 12, 2025 | arXiv ID: 2505.07556v1

By: Kamil Jeziorek, Tomasz Kryjak

Potential Business Impact:

Makes cameras see better in bad light.

Business Areas:

Image Recognition, Data and Analytics, Software

Event cameras offer significant advantages over traditional frame-based sensors, including microsecond temporal resolution, robustness under varying lighting conditions, and low power consumption. Nevertheless, effective processing of their sparse, asynchronous event streams remains challenging. Existing approaches to this problem fall into two groups. The first processes event data directly with neural models, such as Spiking Neural Networks or Graph Convolutional Neural Networks. However, this approach is often accompanied by a compromise in terms of qualitative performance. The second converts events into dense representations with handcrafted aggregation functions, which can boost accuracy at the cost of temporal fidelity. This paper introduces a novel Self-Supervised Event Representation (SSER) method that uses Gated Recurrent Unit (GRU) networks to achieve precise per-pixel encoding of event timestamps and polarities without temporal discretisation. The recurrent layers are trained in a self-supervised manner to maximise the fidelity of event-time encoding. Inference is performed with event representations generated asynchronously, which ensures compatibility with high-throughput sensors. Experimental validation demonstrates that SSER outperforms aggregation-based baselines, achieving improvements of 2.4% mAP and 0.6% on the Gen1 and 1 Mpx object detection datasets, respectively. Furthermore, the paper presents the first hardware implementation of a recurrent representation for event data on a System-on-Chip FPGA, achieving sub-microsecond latency and power consumption between 1 and 2 W, making it suitable for real-time, power-efficient applications. Code is available at https://github.com/vision-agh/RecRepEvent.
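To make the idea of a per-pixel recurrent encoding more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a textbook GRU cell with random, untrained weights, and it assumes each event is fed to the cell as two features: the time since the previous event at that pixel (in seconds) and a signed polarity. The class names, the feature choice, and the hidden size are all hypothetical; the actual SSER architecture and training procedure are in the linked repository.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyGRUCell:
    """Textbook GRU cell with random (untrained) weights. Illustrative only."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        # One (W, U, b) triple per gate: update z, reset r, candidate n.
        self.W = {g: mat(hidden_size, input_size) for g in "zrn"}
        self.U = {g: mat(hidden_size, hidden_size) for g in "zrn"}
        self.b = {g: [0.0] * hidden_size for g in "zrn"}

    def _affine(self, gate, x, h):
        return [
            sum(w * xi for w, xi in zip(self.W[gate][i], x))
            + sum(u * hi for u, hi in zip(self.U[gate][i], h))
            + self.b[gate][i]
            for i in range(self.hidden_size)
        ]

    def step(self, x, h):
        z = [sigmoid(a) for a in self._affine("z", x, h)]
        r = [sigmoid(a) for a in self._affine("r", x, h)]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a) for a in self._affine("n", x, reset_h)]
        # Blend the previous state with the candidate state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


class PerPixelRecurrentRepresentation:
    """Keeps one hidden state per pixel and updates it one event at a time."""

    def __init__(self, width, height, hidden_size=4):
        self.width = width
        self.cell = TinyGRUCell(input_size=2, hidden_size=hidden_size)
        self.state = [[0.0] * hidden_size for _ in range(width * height)]
        self.last_t_us = [None] * (width * height)

    def update(self, x, y, t_us, polarity):
        idx = y * self.width + x
        last = self.last_t_us[idx]
        # Assumed input features: elapsed time (s) and signed polarity.
        dt_s = 0.0 if last is None else (t_us - last) * 1e-6
        features = [dt_s, 1.0 if polarity else -1.0]
        self.state[idx] = self.cell.step(features, self.state[idx])
        self.last_t_us[idx] = t_us


# Events as (x, y, timestamp in microseconds, polarity).
rep = PerPixelRecurrentRepresentation(width=4, height=3)
for event in [(1, 2, 100, 1), (1, 2, 350, 0), (3, 0, 400, 1)]:
    rep.update(*event)
```

Only the pixel that receives an event is touched, so the dense representation `rep.state` can be read out at any moment without binning events into fixed time windows, which is the property the abstract contrasts with handcrafted aggregation.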

TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras

CV and Pattern Recognition

Teaches cameras to see events over time.

29 Jul 2025

89%

Revealing Latent Information: A Physics-inspired Self-supervised Pre-training Framework for Noisy and Sparse Events

CV and Pattern Recognition

Helps cameras see better in tough conditions.

7 Aug 2025

89%

Sparse Convolutional Recurrent Learning for Efficient Event-based Neuromorphic Object Detection

CV and Pattern Recognition

Makes robots see better with less power.

16 Jun 2025

Country of Origin

🇵🇱 Poland

Repos / Data Links

github.com

Page Count

14 pages