ORCA: Object Recognition and Comprehension for Archiving Marine Species
By: Yuk-Kwan Wong, Haixin Liang, Zeyu Ma and more
Marine visual understanding is essential for monitoring and protecting marine ecosystems, enabling automatic and scalable biological surveys. However, progress is hindered by limited training data and the lack of a systematic task formulation that aligns domain-specific marine challenges with well-defined computer vision tasks, thereby limiting effective model application. To address this gap, we present ORCA, a multi-modal benchmark for marine research comprising 14,647 images from 478 species, with 42,217 bounding box annotations and 22,321 expert-verified instance captions. The dataset provides fine-grained visual and textual annotations that capture morphology-oriented attributes across diverse marine species. To catalyze methodological advances, we evaluate 18 state-of-the-art models on three tasks: object detection (closed-set and open-vocabulary), instance captioning, and visual grounding. Results highlight key challenges, including species diversity, morphological overlap, and specialized domain demands, underscoring the difficulty of marine understanding. ORCA thus establishes a comprehensive benchmark to advance research in the marine domain. Project Page: http://orca.hkustvgd.com/.
Similar Papers
MSC: A Marine Wildlife Video Dataset with Grounded Segmentation and Clip-Level Captioning
CV and Pattern Recognition
Helps computers understand and describe underwater videos.