TextMamba: Scene Text Detector with Mamba

Published: December 7, 2025 | arXiv ID: 2512.06657v1

By: Qiyan Zhao, Yue Yan, Da-Han Wang

Potential Business Impact:

Helps computers find words in messy pictures.

Business Areas:

Image Recognition, Data and Analytics, Software

In scene text detection, Transformer-based methods have addressed the global feature extraction limitations inherent in traditional convolutional neural network-based methods. However, most rely directly on native Transformer attention layers as encoders without evaluating their cross-domain limitations and inherent shortcomings: forgetting important information or focusing on irrelevant representations when modeling long-range dependencies for text detection. The recently proposed state space model Mamba has demonstrated better long-range dependency modeling through a linear-complexity selection mechanism. Therefore, we propose a novel scene text detector based on Mamba that integrates the selection mechanism with attention layers, enhancing the encoder's ability to extract relevant information from long sequences. We adopt the Top-k algorithm to explicitly select key information and reduce the interference of irrelevant information in Mamba modeling. Additionally, we design a dual-scale feed-forward network and an embedding pyramid enhancement module to facilitate high-dimensional hidden state interactions and multi-scale feature fusion. Our method achieves state-of-the-art or competitive performance on various benchmarks, with F-measures of 89.7%, 89.2%, and 78.5% on CTW1500, TotalText, and ICDAR19ArT, respectively. Code will be made available.
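The abstract mentions a Top-k step that explicitly keeps key information and suppresses irrelevant tokens. The sketch below illustrates that general idea for a single query in plain Python: only the k highest attention scores survive, and the softmax is taken over those alone. It is an illustration under stated assumptions, not the authors' implementation; the function name `topk_attention`, its arguments, and the use of scaled dot-product scores are all hypothetical choices, and the abstract does not say where in the encoder the selection is applied.

```python
import math


def topk_attention(query, keys, values, top_k):
    """Attend from one query to the top_k highest-scoring keys only.

    Hypothetical helper for illustration; not taken from the paper.
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the query and every key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Indices of the top_k largest scores; all other positions are dropped.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

    # Numerically stable softmax restricted to the kept positions.
    peak = max(scores[i] for i in keep)
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    weights = [exps.get(i, 0.0) / total for i in range(len(keys))]

    # Weighted sum of value vectors; dropped positions contribute nothing.
    dim = len(values[0])
    output = [
        sum(weights[i] * values[i][d] for i in range(len(values)))
        for d in range(dim)
    ]
    return output, weights


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
output, weights = topk_attention(query, keys, values, top_k=2)
```

With `top_k=2`, only the first and third keys receive non-zero weight here, so the two lower-scoring tokens cannot influence the output at all. Dense softmax attention would instead give every token a small but non-zero weight.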

PathMamba: A Hybrid Mamba-Transformer for Topologically Coherent Road Segmentation in Satellite Imagery

CV and Pattern Recognition

Maps roads better using faster, smarter computer vision.

26 Nov 2025 2

90%

HybridMamba: A Dual-domain Mamba for 3D Medical Image Segmentation

CV and Pattern Recognition

Helps doctors see inside bodies better.

18 Sep 2025 0

90%

VCMamba: Bridging Convolutions with Multi-Directional Mamba for Efficient Visual Representation

CV and Pattern Recognition

Helps computers see details and the big picture.

4 Sep 2025 1

Page Count

8 pages
