Breaking Self-Attention Failure: Rethinking Query Initialization for Infrared Small Target Detection

Published: January 6, 2026 | arXiv ID: 2601.02837v1

By: Yuteng Liu, Duanni Meng, Maoxun Yuan and more

Potential Business Impact:

Finds tiny heat spots in blurry pictures.

Business Areas:
Intrusion Detection Information Technology, Privacy and Security

Infrared small target detection (IRSTD) faces significant challenges due to low signal-to-noise ratio (SNR), small target size, and complex, cluttered backgrounds. Although recent DETR-based detectors benefit from global context modeling, they exhibit notable performance degradation on IRSTD. We revisit this phenomenon and reveal that the target-relevant embeddings of infrared small targets are inevitably overwhelmed by dominant background features due to the self-attention mechanism, leading to unreliable query initialization and inaccurate target localization. To address this issue, we propose SEF-DETR, a novel framework that refines query initialization for IRSTD. Specifically, SEF-DETR consists of three components: Frequency-guided Patch Screening (FPS), Dynamic Embedding Enhancement (DEE), and Reliability-Consistency-aware Fusion (RCF). The FPS module leverages the Fourier spectrum of local patches to construct a target-relevant density map, suppressing background-dominated features. DEE strengthens multi-scale representations in a target-aware manner, while RCF further refines object queries by enforcing spatial-frequency consistency and reliability. Extensive experiments on three public IRSTD datasets demonstrate that SEF-DETR achieves superior detection performance compared to state-of-the-art methods, delivering a robust and efficient solution for the infrared small target detection task.
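
The abstract does not give the FPS formulation, so the following is only a rough illustration of the general idea of scoring local patches by their Fourier spectrum. It is not the authors' method. The function name `patch_high_freq_density`, the patch size, the frequency cutoff, and the max-normalized high-frequency energy score are all assumptions made for this sketch.

```python
import numpy as np


def patch_high_freq_density(image, patch=8, cutoff=0.25):
    """Hypothetical per-patch score: high-frequency spectral energy, scaled to [0, 1].

    This is an illustrative stand-in, not the FPS module from the paper.
    """
    h, w = image.shape
    gh, gw = h // patch, w // patch
    # Crop to a whole number of patches, then view as (gh, gw, patch, patch).
    tiles = image[:gh * patch, :gw * patch].astype(np.float64)
    tiles = tiles.reshape(gh, patch, gw, patch).swapaxes(1, 2)
    # Remove each patch's mean so uniform regions carry no energy.
    tiles = tiles - tiles.mean(axis=(2, 3), keepdims=True)
    # Power spectrum of every patch.
    power = np.abs(np.fft.fft2(tiles, axes=(2, 3))) ** 2
    # Radial frequency of each FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(patch)[:, None]
    fx = np.fft.fftfreq(patch)[None, :]
    high = np.sqrt(fy ** 2 + fx ** 2) >= cutoff  # assumed cutoff
    high_energy = (power * high).sum(axis=(2, 3))
    # Normalize by the strongest patch; an all-flat image maps to zeros.
    peak = high_energy.max()
    if peak == 0:
        return np.zeros_like(high_energy)
    return high_energy / peak


# Usage: a smooth horizontal ramp with one bright pixel at (20, 20).
img = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
img[20, 20] += 5.0
density = patch_high_freq_density(img, patch=8)
print(np.unravel_index(np.argmax(density), density.shape))
```

In this toy setup a point-like spike spreads its energy across all frequencies, while a smooth background concentrates it at low frequencies, so the patch containing the spike receives the highest score. Real infrared clutter (cloud edges, sensor noise) also has high-frequency content, which is presumably why the paper pairs screening with further enhancement and fusion stages.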

Country of Origin
🇨🇳 China

Page Count
10 pages

Category
Computer Science:
CV and Pattern Recognition