Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech
By: Tanvi Dinkar, Aiqi Jiang, Simona Frenda and more
Potential Business Impact:
Helps stop online hate speech with community input.
Counterspeech, i.e. the practice of responding to online hate speech, has gained traction in NLP as a promising intervention. While early work emphasised collaboration with non-governmental organisation stakeholders, recent research trends have shifted toward automated pipelines that reuse a small set of legacy datasets, often without input from affected communities. This paper presents a systematic review of 74 NLP studies on counterspeech, analysing the extent to which stakeholder participation influences dataset creation, model development, and evaluation. To complement this analysis, we conducted a participatory case study with five NGOs specialising in online Gender-Based Violence (oGBV), identifying stakeholder-informed practices for counterspeech generation. Our findings reveal a growing disconnect between current NLP research and the needs of communities most impacted by toxic online content. We conclude with concrete recommendations for re-centring stakeholder expertise in counterspeech research.
Similar Papers
Counterspeech for Mitigating the Influence of Media Bias: Comparing Human and LLM-Generated Responses
Computation and Language
Stops mean comments from making news more unfair.
Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories
Computation and Language
AI struggles to fight fake news online.
Beating Harmful Stereotypes Through Facts: RAG-based Counter-speech Generation
Computation and Language
Creates helpful replies to stop online hate.