Large Language Models as Span Annotators
By: Zdeněk Kasner, Vilém Zouhar, Patrícia Schmidtová, and more
Potential Business Impact:
Computers can now find and label specific parts of text.
Span annotation is the task of localizing and classifying text spans according to custom guidelines. Annotated spans can be used to analyze and evaluate high-quality texts for which single-score metrics fail to provide actionable feedback. Until recently, span annotation was limited to human annotators or fine-tuned models. In this study, we show that large language models (LLMs) can serve as flexible and cost-effective span annotation backbones. To demonstrate their utility, we compare LLMs to skilled human annotators on three diverse span annotation tasks: evaluating data-to-text generation, identifying translation errors, and detecting propaganda techniques. We demonstrate that LLMs achieve inter-annotator agreement (IAA) comparable to human annotators at a fraction of the cost per annotation. We also manually analyze model outputs, finding that LLMs make errors at a similar rate to human annotators. We release a dataset of more than 40k model and human annotations for further research.
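To make the task concrete, here is a minimal Python sketch of how annotated spans might be represented and how agreement between two annotators (say, an LLM and a human) could be scored with a character-overlap F1. The `Span` fields, the example text, and the metric are illustrative assumptions; the paper's actual data format and IAA measure may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    start: int  # character offset where the span begins (inclusive)
    end: int    # character offset where the span ends (exclusive)
    label: str  # category from the annotation guidelines

def char_f1(spans_a: list[Span], spans_b: list[Span]) -> float:
    """Character-overlap F1 between two annotators' span sets.

    A character counts as matched only if both annotators cover it
    with the same label.
    """
    chars_a = {(i, s.label) for s in spans_a for i in range(s.start, s.end)}
    chars_b = {(i, s.label) for s in spans_b for i in range(s.start, s.end)}
    if not chars_a and not chars_b:
        return 1.0  # both annotators marked nothing: perfect agreement
    overlap = len(chars_a & chars_b)
    precision = overlap / len(chars_a) if chars_a else 0.0
    recall = overlap / len(chars_b) if chars_b else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: marking a translation error in an output sentence.
text = "The cat sat on the math."
model_spans = [Span(19, 23, "mistranslation")]  # "math"
human_spans = [Span(15, 23, "mistranslation")]  # "the math"
print(char_f1(model_spans, human_spans))  # partial overlap -> ~0.67
```

Character-level overlap is one common choice for span tasks because it gives partial credit when annotators agree on the error but draw slightly different boundaries, as in the example above.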
Similar Papers
Linguistic Blind Spots of Large Language Models
Computation and Language
AI struggles to understand sentence parts.
Evaluating Large Language Models as Expert Annotators
Computation and Language
Computers learn to label text like experts.
A Review on Large Language Models for Visual Analytics
Human-Computer Interaction
Computers learn to understand pictures and words together.