
Balanced Hierarchical Contrastive Learning with Decoupled Queries for Fine-grained Object Detection in Remote Sensing Images

Published: December 30, 2025 | arXiv ID: 2512.24074v1

By: Jingzhou Chen, Dexin Chen, Fengchao Xiong and more

Fine-grained remote sensing datasets often use hierarchical label structures to differentiate objects in a coarse-to-fine manner, with each object annotated across multiple levels. However, embedding this semantic hierarchy into the representation learning space to improve fine-grained detection performance remains challenging. Previous studies have applied supervised contrastive learning at different hierarchical levels to group objects under the same parent class while distinguishing sibling subcategories. Nevertheless, they overlook two critical issues: (1) imbalanced data distribution across the label hierarchy causes high-frequency classes to dominate the learning process, and (2) learning semantic relationships among categories interferes with class-agnostic localization. To address these issues, we propose a balanced hierarchical contrastive loss combined with a decoupled learning strategy within the detection transformer (DETR) framework. The proposed loss introduces learnable class prototypes and equilibrates gradients contributed by different classes at each hierarchical level, ensuring that each hierarchical class contributes equally to the loss computation in every mini-batch. The decoupled strategy separates DETR's object queries into classification and localization sets, enabling task-specific feature extraction and optimization. Experiments on three fine-grained datasets with hierarchical annotations demonstrate that our method outperforms state-of-the-art approaches.
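The balancing idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not give the loss, so the code below assumes a prototype-based softmax contrastive term per hierarchy level and uses per-class averaging as a simple stand-in for "each hierarchical class contributes equally in every mini-batch". The names `balanced_prototype_loss`, `hierarchical_loss`, and the temperature `tau` are hypothetical, and prototypes are plain lists here rather than learnable parameters.

```python
import math
from collections import defaultdict


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def balanced_prototype_loss(embeddings, labels, prototypes, tau=0.1):
    """Class-balanced prototype contrastive loss for ONE hierarchy level.

    Assumption (not from the paper): each sample is scored against all
    class prototypes with a softmax over similarities / tau. Per-sample
    losses are averaged within each class first, then across the classes
    present in the batch, so a frequent class cannot dominate the total.
    """
    per_class = defaultdict(list)
    for z, y in zip(embeddings, labels):
        logits = [dot(z, p) / tau for p in prototypes]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        per_class[y].append(log_norm - logits[y])  # -log softmax at the true class
    class_means = [sum(v) / len(v) for v in per_class.values()]
    return sum(class_means) / len(class_means)


def hierarchical_loss(embeddings, labels_per_level, prototypes_per_level, tau=0.1):
    """Sum the balanced loss over hierarchy levels (coarse to fine).

    Assumption: levels are weighted equally; the paper may weight them differently.
    """
    return sum(
        balanced_prototype_loss(embeddings, labels, protos, tau)
        for labels, protos in zip(labels_per_level, prototypes_per_level)
    )
```

With this averaging scheme, repeating samples of a frequent class in the batch leaves the loss unchanged, which is the property the abstract attributes to the balanced loss. In a real detector the embeddings would come from the classification queries only, with localization handled by a separate query set.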

Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery

CV and Pattern Recognition

Finds small things in satellite pictures better.

1 Dec 2025 1

89%

Late-decoupled 3D Hierarchical Semantic Segmentation with Semantic Prototype Discrimination based Bi-branch Supervision

CV and Pattern Recognition

Helps robots understand 3D spaces better.

20 Nov 2025 1

88%

Multi-Object Grounding via Hierarchical Contrastive Siamese Transformers

CV and Pattern Recognition

Finds many things in 3D space from words.

14 Apr 2025 1
