Towards Cross-Modal Error Detection with Tables and Images
By: Olga Ovcharenko, Sebastian Schelter
Potential Business Impact:
Finds mistakes in data that combines tables, images, and text.
Ensuring data quality at scale remains a persistent challenge for large organizations. Despite recent advances, maintaining accurate and consistent data is still complex, especially when dealing with multiple data modalities. Traditional error detection and correction methods tend to focus on a single modality, typically a table, and often miss cross-modal errors that are common in domains like e-commerce and healthcare, where image, tabular, and text data coexist. To address this gap, we take an initial step towards cross-modal error detection in tabular data by benchmarking several methods. Our evaluation spans four datasets and five baseline approaches. Among them, Cleanlab, a label error detection framework, and DataScope, a data valuation method, perform best when paired with a strong AutoML framework, achieving the highest F1 scores. Our findings indicate that current methods remain limited, particularly when applied to heavy-tailed real-world data, motivating further research in this area.
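Cleanlab, the strongest baseline above, is built on confident learning: an example is flagged as a likely label error when the out-of-sample predicted probability of its given label falls below that class's average self-confidence. The sketch below is a minimal, simplified illustration of that idea (not the paper's method or Cleanlab's actual API); the function name and the toy data are hypothetical.

```python
import numpy as np

def flag_label_issues(labels, pred_probs):
    """Return a boolean mask marking likely label errors.

    Simplified confident-learning-style heuristic: labels and
    pred_probs are assumed to come from out-of-sample predictions.
    """
    labels = np.asarray(labels)
    pred_probs = np.asarray(pred_probs)
    n_classes = pred_probs.shape[1]
    # Per-class threshold: mean predicted probability of class c
    # over the examples currently labeled c.
    thresholds = np.array([
        pred_probs[labels == c, c].mean() for c in range(n_classes)
    ])
    # An example is suspect if its own label scores below its class
    # threshold while at least one class clears its own threshold.
    self_conf = pred_probs[np.arange(len(labels)), labels]
    confident_elsewhere = (pred_probs >= thresholds).any(axis=1)
    return (self_conf < thresholds[labels]) & confident_elsewhere

# Toy data: rows 1 and 3 carry labels the model confidently disagrees with.
labels = np.array([0, 0, 1, 1])
pred_probs = np.array([
    [0.9, 0.1],
    [0.2, 0.8],  # labeled 0, but the model favors class 1
    [0.1, 0.9],
    [0.8, 0.2],  # labeled 1, but the model favors class 0
])
print(flag_label_issues(labels, pred_probs))
```

In this toy setting the two mislabeled rows are the ones flagged; the real benchmark pairs such detectors with an AutoML framework to produce the predicted probabilities.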
Similar Papers
Learning to Detect Label Errors by Making Them: A Method for Segmentation and Object Detection Datasets
Machine Learning (CS)
Finds mistakes in AI image training labels.
From Label Error Detection to Correction: A Modular Framework and Benchmark for Object Detection Datasets
CV and Pattern Recognition
Fixes mistakes in computer vision training data.
Uncovering and Mitigating Transient Blindness in Multimodal Model Editing
Machine Learning (CS)
Improves editing of AI models that both see and read.