Online Anti-sexist Speech: Identifying Resistance to Gender Bias in Political Discourse
By: Aditi Dutta, Susan Banducci
Potential Business Impact:
Helps computers distinguish good speech from bad.
Anti-sexist speech, i.e., public expressions that challenge or resist gendered abuse and sexism, plays a vital role in shaping democratic debate online. Yet automated content moderation systems, increasingly powered by large language models (LLMs), may struggle to distinguish such resistance from the sexism it opposes. This study examines how five LLMs classify sexist, anti-sexist, and neutral political tweets from the UK, focusing on high-salience trigger events involving female Members of Parliament in 2022. Our analysis shows that models frequently misclassify anti-sexist speech as harmful, particularly during politically charged events where rhetorical styles of harm and resistance converge. These errors risk silencing those who challenge sexism, with disproportionate consequences for marginalised voices. We argue that moderation design must move beyond binary harmful/not-harmful schemas, integrate human-in-the-loop review during sensitive events, and explicitly include counter-speech in training data. By linking feminist scholarship, event-based analysis, and model evaluation, this work highlights the sociotechnical challenges of safeguarding resistance speech in digital political spaces.
Similar Papers
Demographic Biases and Gaps in the Perception of Sexism in Large Language Models
Computation and Language
Finds sexism, but not everyone's view.
SafeSpeech: A Comprehensive and Interactive Tool for Analysing Sexist and Abusive Language in Conversations
Computation and Language
Finds mean online talk, even when hidden.
Counterspeech for Mitigating the Influence of Media Bias: Comparing Human and LLM-Generated Responses
Computation and Language
Stops mean comments from making news more unfair.