Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study
By: Faeze Ghorbanpour, Daryna Dementieva, Alexander Fraser
Potential Business Impact:
Finds hate speech in many languages.
Despite growing interest in automated hate speech detection, most existing approaches overlook the linguistic diversity of online content. Multilingual instruction-tuned large language models such as LLaMA, Aya, Qwen, and BloomZ offer promising capabilities across languages, but their effectiveness in identifying hate speech through zero-shot and few-shot prompting remains underexplored. This work evaluates prompting-based detection with LLMs across eight non-English languages, using several prompting techniques and comparing them to fine-tuned encoder models. We show that while zero-shot and few-shot prompting lag behind fine-tuned encoder models on most of the real-world evaluation sets, they achieve better generalization on functional tests for hate speech detection. Our study also reveals that prompt design plays a critical role, with each language often requiring customized prompting techniques to maximize performance.
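The abstract does not reproduce the authors' prompt templates, so the sketch below is illustrative only: it shows the general shape of a zero-shot and a few-shot classification prompt for hate speech detection. The label names, demonstration texts, and the `build_*_prompt` helpers are assumptions for illustration, not the prompts used in the paper.

```python
# Illustrative sketch only: the authors' actual templates, labels, and
# few-shot examples are not given here; everything below is a stand-in.

ZERO_SHOT_TEMPLATE = (
    "You are a content moderation assistant.\n"
    "Classify the following {language} text as 'hate' or 'not hate'.\n"
    "Text: {text}\n"
    "Answer:"
)

FEW_SHOT_HEADER = (
    "You are a content moderation assistant.\n"
    "Classify each {language} text as 'hate' or 'not hate'.\n\n"
)

# Hypothetical in-context demonstrations; in practice these would be drawn
# from labelled data in the target language.
FEW_SHOT_EXAMPLES = [
    ("Example offensive sentence targeting a group.", "hate"),
    ("Example neutral sentence about the weather.", "not hate"),
]


def build_zero_shot_prompt(text: str, language: str) -> str:
    """Return a zero-shot classification prompt for one input text."""
    return ZERO_SHOT_TEMPLATE.format(language=language, text=text)


def build_few_shot_prompt(text: str, language: str) -> str:
    """Return a few-shot prompt: labelled demonstrations followed by the query."""
    demos = "".join(
        f"Text: {example}\nAnswer: {label}\n\n"
        for example, label in FEW_SHOT_EXAMPLES
    )
    return FEW_SHOT_HEADER.format(language=language) + demos + f"Text: {text}\nAnswer:"


if __name__ == "__main__":
    # The resulting strings can be sent to any instruction-tuned LLM
    # (via an API or a local model); the model's completion is then
    # mapped back to a binary label for evaluation.
    print(build_zero_shot_prompt("Beispieltext auf Deutsch.", "German"))
    print(build_few_shot_prompt("Beispieltext auf Deutsch.", "German"))
```

Since the paper finds that each language often needs its own prompt design, a setup like this would typically keep templates and demonstrations configurable per language rather than fixed.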
Similar Papers
System Report for CCL25-Eval Task 10: Prompt-Driven Large Language Model Merge for Fine-Grained Chinese Hate Speech Detection
Computation and Language
Finds hidden hate speech online.
Labels or Input? Rethinking Augmentation in Multimodal Hate Detection
CV and Pattern Recognition
Finds mean memes by looking at pictures and words.
Few-shot Hate Speech Detection Based on the MindSpore Framework
Computation and Language
Finds hate speech with fewer examples.