Evaluating the Performance of Open-Vocabulary Object Detection in Low-quality Image
By: Po-Chih Wu
Potential Business Impact:
Helps computers see objects in blurry pictures.
Open-vocabulary object detection enables models to localize and recognize objects beyond a predefined set of categories, and such models are expected to achieve recognition capabilities comparable to human performance. In this study, we evaluate existing open-vocabulary object detection models under low-quality image conditions. For this purpose, we introduce a new dataset that simulates low-quality images found in the real world. In our evaluation experiments, the models showed no significant decrease in mAP under low-level image degradation, but the performance of all models dropped sharply under high-level image degradation. OWLv2 models consistently performed better across different types of degradation, while OWL-ViT, GroundingDINO, and Detic showed significant performance declines. We will release our dataset and code to facilitate future studies.
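The idea of simulating low-quality images at several degradation levels can be illustrated with a minimal sketch. The abstract does not say which degradation types or severity settings the dataset uses, so everything here is an assumption made for illustration: the two degradation types (Gaussian noise and box blur), the severity tables `NOISE_SIGMA` and `BLUR_RADIUS`, and the helper name `degrade` are all hypothetical.

```python
import numpy as np

# Hypothetical severity tables (level 0 = clean image). These values are
# illustrative assumptions, not the settings used by the paper's dataset.
NOISE_SIGMA = {0: 0.0, 1: 5.0, 2: 15.0, 3: 40.0}   # noise std on a 0-255 scale
BLUR_RADIUS = {0: 0, 1: 1, 2: 2, 3: 4}             # box-filter radius in pixels


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to a uint8 image and clip to [0, 255]."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def box_blur(image, radius):
    """Blur an H x W x C uint8 image with a (2*radius+1)-wide box filter."""
    if radius == 0:
        return image.copy()
    k = 2 * radius + 1
    height, width = image.shape[:2]
    # Replicate edge pixels so the output keeps the input size.
    padded = np.pad(
        image.astype(np.float64),
        ((radius, radius), (radius, radius), (0, 0)),
        mode="edge",
    )
    total = np.zeros(image.shape, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            total += padded[dy:dy + height, dx:dx + width, :]
    return np.round(total / (k * k)).astype(np.uint8)


def degrade(image, kind, severity, seed=0):
    """Return a degraded copy of `image` for an assumed kind and severity level."""
    if kind == "noise":
        rng = np.random.default_rng(seed)
        return add_gaussian_noise(image, NOISE_SIGMA[severity], rng)
    if kind == "blur":
        return box_blur(image, BLUR_RADIUS[severity])
    raise ValueError(f"unknown degradation kind: {kind}")


if __name__ == "__main__":
    clean = np.full((32, 32, 3), 128, dtype=np.uint8)
    for level in sorted(NOISE_SIGMA):
        noisy = degrade(clean, "noise", level)
        error = np.abs(noisy.astype(np.float64) - clean).mean()
        print(f"noise level {level}: mean absolute error {error:.2f}")
```

In an evaluation of the kind the abstract describes, each detector would be run on the degraded copies at every level, and mAP would be compared against the clean-image baseline. The detector calls and the mAP computation are omitted here because they depend on the specific model and evaluation toolkit.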
Similar Papers
ODOV: Towards Open-Domain Open-Vocabulary Object Detection
CV and Pattern Recognition
Helps computers recognize any object anywhere.
Auto-Vocabulary 3D Object Detection
CV and Pattern Recognition
Lets computers find and name objects they've never seen.
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
CV and Pattern Recognition
Helps computers see and name new things.