Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment
By: Avinaash Manoharan, Xiangyu Yin, Domenik Helm and more
Potential Business Impact:
Checks if computer vision sees things right.
Evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available. We introduce the Cumulative Consensus Score (CCS), a label-free metric that enables continuous monitoring and comparison of detectors in real-world settings. CCS applies test-time data augmentation to each image, collects predicted bounding boxes across augmented views, and computes overlaps using Intersection over Union. Maximum overlaps are normalized and averaged across augmentation pairs, yielding a measure of spatial consistency that serves as a proxy for reliability without annotations. In controlled experiments on Open Images and KITTI, CCS achieved over 90% congruence with F1-score, Probabilistic Detection Quality, and Optimal Correction Cost. The method is model-agnostic, working across single-stage and two-stage detectors, and operates at the case level to highlight under-performing scenarios. Altogether, CCS provides a robust foundation for DevOps-style monitoring of object detectors.
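The procedure described above can be sketched in a few lines: for each pair of augmented views (with predicted boxes assumed to be mapped back into the original image frame), each box is matched to its best-overlapping box in the other view by IoU, and these maximum overlaps are averaged across boxes and across pairs. This is a minimal illustration, not the authors' implementation; the function names and the handling of empty views are assumptions.

```python
import itertools

def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cumulative_consensus_score(boxes_per_view):
    """Label-free consistency score over augmented views of one image.

    boxes_per_view: list of box lists, one per augmented view, with
    boxes already transformed back to the original image coordinates.
    For every pair of views, each box in one view is matched to its
    maximum-IoU box in the other; these maximum overlaps (already in
    [0, 1]) are averaged across boxes, then across all view pairs.
    """
    pair_scores = []
    for va, vb in itertools.combinations(boxes_per_view, 2):
        if not va or not vb:
            pair_scores.append(0.0)  # assumption: an empty view counts as no consensus
            continue
        max_overlaps = [max(iou(a, b) for b in vb) for a in va]
        pair_scores.append(sum(max_overlaps) / len(max_overlaps))
    return sum(pair_scores) / len(pair_scores) if pair_scores else 0.0
```

Under this sketch, perfectly agreeing views score 1.0 and fully disjoint predictions score 0.0, so per-image scores can be aggregated at the case level to flag under-performing scenarios without any ground-truth labels.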
Similar Papers
Consistency Change Detection Framework for Unsupervised Remote Sensing Change Detection
CV and Pattern Recognition
Finds changes in Earth pictures automatically.
Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
CV and Pattern Recognition
Checks if computer pictures are correct automatically.
CCE: Confidence-Consistency Evaluation for Time Series Anomaly Detection
Machine Learning (CS)
Better checks for weird computer activity.