
Focus Through Motion: RGB-Event Collaborative Token Sparsification for Efficient Object Detection

Published: September 4, 2025 | arXiv ID: 2509.03872v1

By: Nan Yang, Yang Wang, Zhanwen Liu and more

Potential Business Impact:

Helps computers see better by focusing on important parts.

Business Areas:

Events, Media and Entertainment

Existing RGB-Event detection methods process the low-information regions of both modalities (background in images and non-event regions in event data) uniformly during feature extraction and fusion, resulting in high computational costs and suboptimal performance. To reduce this redundancy during feature extraction, token sparsification methods have been proposed separately for the image modality and the event modality. However, these methods select tokens using a fixed number or a fixed threshold, which makes it hard to retain the informative tokens across samples of varying complexity. To achieve a better balance between accuracy and efficiency, we propose FocusMamba, which performs adaptive collaborative sparsification of multimodal features and efficiently integrates complementary information. Specifically, an Event-Guided Multimodal Sparsification (EGMS) strategy is designed to identify and adaptively discard low-information regions within each modality by leveraging the scene content changes perceived by the event camera. Based on the sparsification results, a Cross-Modality Focus Fusion (CMFF) module is proposed to capture and integrate complementary features from both modalities. Experiments on the DSEC-Det and PKU-DAVIS-SOD datasets demonstrate that the proposed method outperforms existing methods in both accuracy and efficiency. The code will be available at https://github.com/Zizzzzzzz/FocusMamba.
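
The contrast the abstract draws between fixed-budget and sample-adaptive token selection can be illustrated with a small sketch. This is not the authors' EGMS implementation, which the abstract does not specify. The function names (`event_activity`, `fixed_topk`, `adaptive_keep`), the per-patch event count used as a token score, and the "fraction of the sample's mean score" rule are all assumptions chosen only to make the idea concrete.

```python
from typing import List


def event_activity(event_counts: List[List[int]], patch: int) -> List[float]:
    """Sum event counts in each non-overlapping patch x patch window.

    Returns one score per token, in row-major patch order.
    (Hypothetical scoring rule, not taken from the paper.)
    """
    h, w = len(event_counts), len(event_counts[0])
    scores = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            total = sum(
                event_counts[i][j]
                for i in range(r, min(r + patch, h))
                for j in range(c, min(c + patch, w))
            )
            scores.append(float(total))
    return scores


def fixed_topk(scores: List[float], k: int) -> List[int]:
    """Baseline: always keep the k highest-scoring tokens."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


def adaptive_keep(scores: List[float], ratio: float = 0.5) -> List[int]:
    """Keep tokens scoring above ratio * (this sample's mean score).

    The threshold depends on the sample, so the number of kept tokens varies.
    If the sample has no events at all, keep every token (assumed fallback).
    """
    mean = sum(scores) / len(scores)
    if mean == 0:
        return list(range(len(scores)))
    threshold = ratio * mean
    return [i for i, s in enumerate(scores) if s > threshold]


# Two toy 4x4 event-count maps, split into four 2x2 tokens each.
sparse_scene = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 2, 2],
]
busy_scene = [
    [1, 2, 0, 0],
    [2, 1, 0, 0],
    [1, 1, 3, 1],
    [2, 0, 2, 2],
]

for name, scene in (("sparse", sparse_scene), ("busy", busy_scene)):
    scores = event_activity(scene, patch=2)
    print(name, "fixed:", fixed_topk(scores, k=2), "adaptive:", adaptive_keep(scores))
```

With a fixed budget of two tokens, both toy scenes keep the same number of tokens, so the sparse scene retains an empty token while the busy scene drops an active one. The adaptive rule keeps one token for the sparse scene and three for the busy one, which is the behavior the abstract argues for. The same kept-index set could then be applied to the RGB tokens as well, under the further assumption that the two modalities are spatially aligned.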

Beyond conventional vision: RGB-event fusion for robust object detection in dynamic traffic scenarios

CV and Pattern Recognition

Helps cars see better in dark and tunnels.

14 Aug 2025 1

89%

SMamba: Sparse Mamba for Event-based Object Detection

CV and Pattern Recognition

Makes cameras see better with less work.

21 Jan 2025 1

88%

Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation

CV and Pattern Recognition

Makes cameras see fast motion better.

1 Jan 2025 3


Country of Origin

🇨🇳 China

Repos / Data Links

github.com

Page Count

11 pages
