Route-DETR: Pairwise Query Routing in Transformers for Object Detection
By: Ye Zhang, Qi Chen, Wenyou Huang, and more
Potential Business Impact:
Helps computers find objects in pictures faster.
Detection Transformer (DETR) offers an end-to-end solution for object detection by eliminating hand-crafted components like non-maximum suppression. However, DETR suffers from inefficient query competition, where multiple queries converge to similar positions and produce redundant computation. We present Route-DETR, which addresses these issues through adaptive pairwise routing in the decoder self-attention layers. Our key insight is to distinguish competing queries (targeting the same object) from complementary queries (targeting different objects) using inter-query similarity, confidence scores, and geometry. We introduce dual routing mechanisms: suppressor routes that modulate attention between competing queries to reduce duplication, and delegator routes that encourage exploration of different regions. Both are implemented via learnable low-rank attention biases that enable asymmetric query interactions. A dual-branch training strategy applies the routing biases only during training while preserving standard attention at inference, so no additional computational cost is incurred at test time. Experiments on COCO and Cityscapes demonstrate consistent improvements across multiple DETR baselines, achieving a +1.7% mAP gain over DINO with a ResNet-50 backbone and reaching 57.6% mAP with Swin-L, surpassing prior state-of-the-art models.
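To make the mechanism concrete, here is a minimal NumPy sketch of the core idea as described in the abstract: an asymmetric low-rank bias added to decoder self-attention logits during training only, with standard attention at inference. All shapes, values, and names here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_queries, dim, rank = 6, 16, 4  # hypothetical sizes, far smaller than a real DETR decoder

# Stand-in query/key projections of the object queries (random for illustration).
Q = rng.normal(size=(num_queries, dim))
K = rng.normal(size=(num_queries, dim))

# Low-rank factors (learned in the paper; random here). Because U != V,
# the bias B = U V^T is asymmetric, allowing directional query interactions
# (e.g., query i suppresses j without j suppressing i).
U = 0.1 * rng.normal(size=(num_queries, rank))
V = 0.1 * rng.normal(size=(num_queries, rank))


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(training: bool) -> np.ndarray:
    scores = Q @ K.T / np.sqrt(dim)   # standard scaled dot-product logits
    if training:
        scores = scores + U @ V.T     # dual-branch: low-rank routing bias, training only
    return softmax(scores, axis=-1)


A_train = self_attention(training=True)
A_infer = self_attention(training=False)   # identical to vanilla attention: zero extra cost
```

Since the bias is dropped at inference, `A_infer` is exactly the baseline attention map, which is how the method avoids any test-time overhead.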
Similar Papers
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
CV and Pattern Recognition
Finds objects in pictures faster and better.
SO-DETR: Leveraging Dual-Domain Features and Knowledge Distillation for Small Object Detection
CV and Pattern Recognition
Finds tiny things in pictures better.
RT-DETR++ for UAV Object Detection
CV and Pattern Recognition
Finds tiny things in drone pictures faster.