
ZeroED: Hybrid Zero-shot Error Detection through Large Language Model Reasoning

Published: April 6, 2025 | arXiv ID: 2504.05345v1

By: Wei Ni, Kaihang Zhang, Xiaoye Miao and more

Potential Business Impact:

Finds mistakes in data without needing many examples.

Business Areas:

Semantic Search, Internet Services

Error detection (ED) in tabular data is crucial yet challenging because error types are diverse and detecting them requires contextual understanding. Traditional ED methods rely heavily on manually specified criteria and labels, which makes them labor-intensive. Large language models (LLMs) can reduce human effort but struggle with errors that require a comprehensive understanding of the data context. In this paper, we propose ZeroED, a novel hybrid zero-shot error detection framework that combines LLM reasoning ability with the manual label-based ED pipeline. ZeroED operates in four steps: feature representation, error labeling, training data construction, and detector training. First, to make errors easier to distinguish, ZeroED generates rich data representations using error reason-aware binary features, pre-trained embeddings, and statistical features. Then, ZeroED employs an LLM to label errors holistically through in-context learning, guided by a two-step reasoning process that yields detailed error detection guidelines. To reduce token costs, the LLM is applied only to representative data selected via clustering-based sampling. High-quality training data is then constructed through in-cluster label propagation and LLM augmentation with verification. Finally, a classifier is trained to detect all errors. Extensive experiments on seven public datasets demonstrate that ZeroED substantially outperforms state-of-the-art methods, improving F1 score by up to 30% and reducing token cost by up to 90%.
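The sampling and propagation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the toy features, the grouping rule standing in for clustering, and the `mock_llm_label` function (a hypothetical placeholder for a real LLM call) are all assumptions made for illustration. The point is only that labeling one representative per cluster and copying its label to the other members needs fewer labeling calls than labeling every cell.

```python
from collections import Counter


def featurize(values):
    """Toy per-cell features: relative frequency, length, digit ratio."""
    counts = Counter(values)
    n = len(values)
    feats = []
    for v in values:
        digits = sum(ch.isdigit() for ch in v)
        feats.append((counts[v] / n, len(v), digits / max(len(v), 1)))
    return feats


def cluster(feats):
    """Stand-in for clustering: group cells whose rounded features match."""
    groups = {}
    for i, f in enumerate(feats):
        key = (round(f[0], 1), f[1], round(f[2], 1))
        groups.setdefault(key, []).append(i)
    return list(groups.values())


def mock_llm_label(value):
    """Hypothetical placeholder for an LLM call: flags non-numeric ages."""
    return not value.isdigit()


def propagate_labels(values, clusters, labeler):
    """Label one representative per cluster, then copy it to all members."""
    labels = [None] * len(values)
    calls = 0
    for members in clusters:
        representative = members[0]
        label = labeler(values[representative])
        calls += 1
        for i in members:
            labels[i] = label
    return labels, calls


# A made-up "age" column with three suspicious cells.
ages = ["34", "27", "n/a", "41", "n/a", "2x", "58"]
feats = featurize(ages)
clusters = cluster(feats)
labels, calls = propagate_labels(ages, clusters, mock_llm_label)
print(f"{calls} labeling calls for {len(ages)} cells")
print(list(zip(ages, labels)))
```

In a fuller pipeline, the resulting `(feats, labels)` pairs would serve as training data for the final classifier; that step, along with the LLM augmentation and verification the abstract mentions, is omitted here.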

Ensembling LLM-Induced Decision Trees for Explainable and Robust Error Detection

Computation and Language

Helps computers find and fix bad data.

8 Dec 2025 2

88%

A Zero-shot Learning Method Based on Large Language Models for Multi-modal Knowledge Graph Embedding

Artificial Intelligence

Lets computers learn about new things without seeing them.

10 Mar 2025 0

88%

DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning

Computation and Language

Finds events in text without prior examples.

5 Jun 2025 0


Country of Origin

🇨🇳 🇭🇰 Hong Kong, China

Page Count

14 pages

