FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting
By: Yi Liu, Jingyu Song, Vedanth Kallakuri and more
Potential Business Impact:
Helps scientists count fish better underwater.
Analyzing underwater fish imagery is critical for ecological monitoring but remains difficult due to visual degradation and costly annotations. We introduce FishDetector-R1, a unified MLLM-based framework for fish detection, segmentation, and counting under weak supervision. On the DeepFish dataset, our framework achieves substantial gains over baselines, improving AP by 20% and mIoU by 10%, while reducing MAE by 30% and GAME by 35%. These improvements stem from two key components: a novel detect-to-count prompt that enforces spatially consistent detections and counts, and Reinforcement Learning from Verifiable Reward (RLVR), paired with a complementary, scalable paradigm that leverages sparse point labels. Ablation studies further validate the effectiveness of this reward design. Moreover, the improvements generalize well to other underwater datasets, confirming strong cross-domain robustness. Overall, FishDetector-R1 provides a reliable and scalable solution for accurate marine visual understanding via weak supervision. The project page for FishDetector-R1 is https://umfieldrobotics.github.io/FishDetector-R1.
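The two counting metrics named in the abstract, MAE and GAME, can be illustrated with a minimal Python sketch. It assumes the commonly used definitions (MAE as the mean absolute difference between predicted and true counts per image, and GAME at level L as the same error summed over a 2^L x 2^L grid of image cells) and represents each fish as a single (x, y) point. It is not taken from the FishDetector-R1 code; the function names `grid_counts`, `game`, and `mae` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) location of one fish, in pixels


def grid_counts(points: Sequence[Point], width: float, height: float,
                level: int) -> List[int]:
    """Count points falling in each cell of a 2^level x 2^level grid."""
    cells = 2 ** level
    counts = [0] * (cells * cells)
    for x, y in points:
        # Clamp so that points on the right/bottom edge land in the last cell.
        col = min(int(x / width * cells), cells - 1)
        row = min(int(y / height * cells), cells - 1)
        counts[row * cells + col] += 1
    return counts


def game(pred: Sequence[Sequence[Point]], gt: Sequence[Sequence[Point]],
         width: float, height: float, level: int) -> float:
    """Grid Average Mean absolute Error over a set of same-sized images."""
    per_image = []
    for p, g in zip(pred, gt):
        pc = grid_counts(p, width, height, level)
        gc = grid_counts(g, width, height, level)
        per_image.append(sum(abs(a - b) for a, b in zip(pc, gc)))
    return sum(per_image) / len(per_image)


def mae(pred: Sequence[Sequence[Point]], gt: Sequence[Sequence[Point]],
        width: float, height: float) -> float:
    """MAE on whole-image counts, which equals GAME at level 0."""
    return game(pred, gt, width, height, level=0)


# Toy example (made-up points): the total count is right, the locations are not.
gt_points = [[(10.0, 10.0), (90.0, 90.0)]]
pred_points = [[(10.0, 10.0), (20.0, 20.0)]]
print(mae(pred_points, gt_points, 100, 100))
print(game(pred_points, gt_points, 100, 100, level=1))
```

In this toy case the predicted total matches the true total, so MAE is zero, while GAME at level 1 is nonzero because one prediction falls in the wrong grid cell. Under these definitions GAME penalizes counts that are right overall but wrong in where they are placed, which is the kind of spatial consistency the detect-to-count prompt is described as enforcing.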
Similar Papers
Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
CV and Pattern Recognition
Helps computers identify fish from video automatically.
An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research
CV and Pattern Recognition
Robot finds and reports ocean life automatically.
Deep Learning-Enhanced Visual Monitoring in Hazardous Underwater Environments with a Swarm of Micro-Robots
Robotics
Robots map dangerous places automatically and safely.