Assessing and Refining ChatGPT's Performance in Identifying Targeting and Inappropriate Language: A Comparative Study
By: Barbarestani Baran, Maks Isa, Vossen Piek
Potential Business Impact:
AI spots bad online words better.
This study evaluates the effectiveness of ChatGPT, an advanced AI model for natural language processing, in identifying targeting and inappropriate language in online comments. With the growing challenge of moderating vast volumes of user-generated content on social networking sites, the role of AI in content moderation has gained prominence. We compared ChatGPT's performance against crowd-sourced annotations and expert evaluations to assess its accuracy, scope of detection, and consistency. Our findings show that ChatGPT performs well in detecting inappropriate content, with notable improvements in accuracy through iterative refinements, particularly in Version 6. However, its performance in targeting-language detection was more variable, with higher false-positive rates compared to expert judgments. This study contributes to the field by demonstrating the potential of AI models like ChatGPT to enhance automated content moderation systems while also identifying areas for further improvement. The results underscore the importance of continuous model refinement and contextual understanding to better support automated moderation and mitigate harmful online behavior.
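The abstract compares model labels against expert judgments using accuracy and false-positive rate. A minimal sketch of how such a comparison can be computed for a binary flag (this is an illustrative example, not the authors' evaluation code, and all labels below are hypothetical):

```python
# Illustrative sketch: comparing binary model labels (1 = flagged as
# targeting/inappropriate, 0 = not flagged) against expert annotations,
# and computing the accuracy and false-positive rate mentioned in the
# abstract. The labels are made-up examples, not data from the study.

def evaluate(predictions, expert_labels):
    """Return (accuracy, false_positive_rate) for paired binary labels."""
    assert len(predictions) == len(expert_labels)
    correct = sum(p == e for p, e in zip(predictions, expert_labels))
    # False positive: model flags a comment the expert judged acceptable.
    false_pos = sum(p == 1 and e == 0
                    for p, e in zip(predictions, expert_labels))
    negatives = sum(e == 0 for e in expert_labels)
    accuracy = correct / len(predictions)
    fpr = false_pos / negatives if negatives else 0.0
    return accuracy, fpr

# Hypothetical labels for eight comments.
model_preds   = [1, 1, 0, 1, 0, 1, 0, 0]
expert_labels = [1, 0, 0, 1, 0, 0, 0, 1]

acc, fpr = evaluate(model_preds, expert_labels)
print(f"accuracy={acc:.2f}, false_positive_rate={fpr:.2f}")
```

A higher false-positive rate, as reported for targeting-language detection, means the model flags comments that experts consider acceptable more often than desired.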
Similar Papers
Understanding and Analyzing Inappropriately Targeting Language in Online Discourse: A Comparative Annotation Study
Computation and Language
Finds mean online words to make internet safer.
A Comparison of Human and ChatGPT Classification Performance on Complex Social Media Data
Computation and Language
AI struggles to understand tricky words.
On Assessing the Relevance of Code Reviews Authored by Generative Models
Software Engineering
AI writes better code reviews than people.